For Whom Is This Book Written?


Crow’s Law: Do not think what you want to think until you know what you
ought to know.¹


Linear algebra is a living, active branch of mathematical research which is central 
to almost all other areas of mathematics and which has important applications in all 
branches of the physical and social sciences and in engineering. However, in recent 
years the content of linear algebra courses required to complete an undergraduate 
degree in mathematics—and even more so in other areas—at all but the most ded- 
icated universities, has been depleted to the extent that it falls far short of what is 
in fact needed for graduate study and research or for real-world application. This 
is true not only in the areas of theoretical work but also in the areas of computa- 
tional matrix theory, which are becoming more and more important to the working 
researcher as personal computers become a common and powerful tool. Students 
are not only less able to formulate or even follow mathematical proofs, they are also 
less able to understand the underlying mathematics of the numerical algorithms they 
must use. The resulting knowledge gap has led to frustration and recrimination on 
the part of both students and faculty alike, with each silently—and sometimes not 
so silently—blaming the other for the resulting state of affairs. This book is written 
with the intention of bridging that gap. It was designed to be used in one or more of
several possible ways: 
(1) As a self-study guide; 
(2) As a textbook for a course in advanced linear algebra, either at the upper-class 
undergraduate level or at the first-year graduate level; or 
(3) As a reference book.
It is also designed to be used to prepare for the linear algebra portion of prelim 
exams or Ph.D. qualifying exams. 

This volume is self-contained to the extent that it does not assume any previ- 
ous knowledge of formal linear algebra, though the reader is assumed to have been 
exposed, at least informally, to some basic ideas or techniques, such as matrix
manipulation and the solution of a small system of linear equations. It does, however,
assume a seriousness of purpose, considerable motivation, and a modicum of mathematical
sophistication on the part of the reader.

¹This law, attributed to John Crow of King’s College, London, is quoted by R.V. Jones in his book
Most Secret War, Wordsworth, 1998 (ISBN 978-1853266997).

The theoretical constructions presented here are illustrated with a large number of 
examples taken from various areas of pure and applied mathematics. As in any area 
of mathematics, theory and concrete examples must go hand in hand and need to be 
studied together. As the German philosopher Immanuel Kant famously remarked, 
concepts without percepts are empty, whereas percepts without concepts are blind.

The book also contains a large number of exercises, many of which are quite 
challenging, which I have come across or thought up in over thirty years of teaching. 
Many of these exercises have appeared in print before, in such journals as Ameri- 
can Mathematical Monthly, College Mathematics Journal, Mathematical Gazette, 
or Mathematics Magazine, in various mathematics competitions or circulated prob- 
lem collections, or even on the internet. Some were donated to me by colleagues 
and even students, and some originated in files of old exams at various universities 
which I have visited in the course of my career. Since, over the years, I did not keep 
track of their sources, all I can do is offer a collective acknowledgement to all those 
to whom it is due. Good problem formulators, like the God of the abbot of Citeaux, 
know their own. Deliberately, difficult exercises are not marked with an asterisk or 
other symbol. Solving exercises is an integral part of learning mathematics and the 
reader is definitely expected to do so, especially when the book is used for self- 
study. Try them all and remember the “grook” penned by the Danish genius Piet 
Hein: Problems worthy of attack / Prove their worth by hitting back. 

Solving a problem using theoretical mathematics is often very different from 
solving it computationally, and so strong emphasis is placed on the interplay of the- 
oretical and computational results. Real-life implementation of theoretical results 
is perpetually plagued by errors: errors in modeling, errors in data acquisition and 
recording, and errors in the computational process itself due to roundoff and trun- 
cation. There are further constraints imposed by limitations in time and memory 
available for computation. Thus the most elegant theoretical solution to a problem 
may not lead to the most efficient or useful method of solution in practice. While no 
reference is made to particular computer software, the concurrent use of a personal 
computer equipped with symbolic-manipulation software such as MAPLE, MATHEMATICA,
MATLAB, or MUPAD is definitely advised.

In order to show the “human face” of mathematics, the book also includes a 
large number of thumbnail photographs of researchers who have contributed to the 
development of the material presented in this volume. 


Acknowledgements Most of the first edition of this book was written while I was
a visitor at the University of Iowa in Iowa City and at the University of California 
in Berkeley. I would like to thank both institutions for providing the facilities and, 
more importantly, the mathematical atmosphere which allowed me to concentrate 
on writing. Subsequent, extensively revised editions were prepared after I retired
from teaching at the University of Haifa in April 2004.

I have talked to many students and faculty members about my plans for this book 
and have obtained valuable insights from them. In particular, I would like to acknowledge
the aid of the following colleagues and students who were kind enough
to read the preliminary versions of this book and offer their comments and corrections:
Prof. Daniel Anderson (University of Iowa), Prof. Adi Ben-Israel (Rutgers
University), Prof. Robert Cacioppo (Truman State University), Prof. Joseph Felsen- 
stein (University of Washington), Prof. Ryan Skip Garibaldi (Emory University), 
Mr. George Kirkup (University of California, Berkeley), Dr. Denis Sevee (John Al- 
bert College), Prof. Earl Taft (Rutgers University), Mr. Gil Vernik (University of 
Haifa). 


Haifa, Israel Jonathan S. Golan 


Contents 


1 Notation and Terminology
2 Fields
3 Vector Spaces Over a Field
4 Algebras Over a Field
5 Linear Independence and Dimension
6 Linear Transformations
7 The Endomorphism Algebra of a Vector Space
8 Representation of Linear Transformations by Matrices
9 The Algebra of Square Matrices
10 Systems of Linear Equations
11 Determinants
12 Eigenvalues and Eigenvectors
13 Krylov Subspaces
14 The Dual Space
15 Inner Product Spaces
16 Orthogonality
17 Selfadjoint Endomorphisms
18 Unitary and Normal Endomorphisms
19 Moore-Penrose Pseudoinverses
20 Bilinear Transformations and Forms
Appendix A Summary of Notation
Appendix B Index to Thumbnail Photos
Index


Notation and Terminology 


Sets will be denoted by braces, { }, between which we will either list the elements 
of the set or give a rule for determining whether something is an element of the set 
or not,¹ as in $\{x \mid p(x)\}$, which is read “the set of all $x$ such that $p(x)$”. If $a$ is an
element of a set $A$, we write $a \in A$; if it is not an element of $A$, we write $a \notin A$. When
one enumerates the elements of a set, the order is not important. Thus $\{1, 2, 3, 4\}$
and $\{4, 1, 3, 2\}$ both denote the same set. However, we often do wish to impose an
order on sets the elements of which we enumerate. Rather than introduce new and
cumbersome notation to handle this, we will make the convention that when we enumerate
the elements of a finite or countably-infinite set, we will assume an implied
order, reading from left to right. Thus, the implied order on the set $\{1, 2, 3, \ldots\}$ is indeed
the usual one, whereas $\{4, 1, 3, 2\}$ gives the first four positive integers, ordered
alphabetically. The empty set, namely the set having no elements, is denoted by $\emptyset$.
Sometimes we will use the word “collection” as a synonym for “set”, generally to 
avoid talking about “sets of sets”. 

A finite or countably-infinite selection of elements of a set A is a list. Members 
of a list are assumed to be in a definite order, given by their indices or by the implied
order of reading from left to right. Lists are usually written without brackets:
$a_1, \ldots, a_n$, though, in certain contexts, it will be more convenient to write them as
ordered $n$-tuples $(a_1, \ldots, a_n)$. Note that the elements of a list need not be distinct:
3, 1, 4, 1, 5, 9 is a list of six positive integers, the second and fourth elements of 
which are equal to 1. A countably-infinite list of elements of a set A is also often 
called a sequence of elements of A. The set of all distinct members of a list is called 
the underlying subset of the list. 

If $A$ and $B$ are sets, then their union $A \cup B$ is the set of all elements that belong to
either $A$ or $B$, and their intersection $A \cap B$ is the set of all elements belonging both
to $A$ and to $B$. More generally, if $\{A_i \mid i \in \Omega\}$ is a (possibly-infinite) collection of
sets, then $\bigcup_{i \in \Omega} A_i$ is the set of all elements that belong to at least one of the $A_i$ and
$\bigcap_{i \in \Omega} A_i$ is the set of all elements that belong to all of the $A_i$. If $A$ and $B$ are sets,
then the difference set $A \setminus B$ is the set of all elements of $A$ which do not belong
to $B$.

¹Mathematically, these two ways of defining a set are equivalent, but philosophically and functionally
they are not. Listing the elements of a set involves denotation whereas giving a rule for
determining set membership involves connotation. This distinction becomes important when we
attempt to use computers to manipulate sets.

A function f from a nonempty set A to a nonempty set B is a rule which assigns 
to each element a of A a unique element f(a) of B. The set A is called the domain 
of the function and the set B is called the range of the function. To denote that f is 
a function from $A$ to $B$, we write $f : A \to B$. To denote that an element $b$ of $B$ is
assigned to an element $a$ of $A$ by $f$, we write $f : a \mapsto b$. (Note the different form
of the arrow!) This notation is particularly helpful in the case that the function $f$ is
defined by a formula. Thus, for example, if $f$ is a function from the set of integers
to the set of integers defined by $f : a \mapsto a^3$, then we know that $f$ assigns to each
integer its cube. The set of all functions from a nonempty set $A$ to a nonempty set
$B$ is denoted by $B^A$. If $f \in B^A$ and if $A'$ is a nonempty subset of $A$, then a function
$f' \in B^{A'}$ is the restriction of $f$ to $A'$, and $f$ is the extension of $f'$ to $A$, if and only
if $f' : a' \mapsto f(a')$ for all $a' \in A'$.

Functions $f$ and $g$ in $B^A$ are equal if and only if $f(a) = g(a)$ for all $a \in A$.
In this case, we write $f = g$. A function $f \in B^A$ is monic if and only if it assigns
different elements of $B$ to different elements of $A$, i.e., if and only if $f(a_1) \neq f(a_2)$
whenever $a_1 \neq a_2$ in $A$. A function $f \in B^A$ is epic if and only if every element
of $B$ is assigned by $f$ to some element of $A$. A function which is both monic and
epic is bijective. A bijective function from a set $A$ to a set $B$ determines a bijective
correspondence between the elements of $A$ and the elements of $B$. If $f : A \to B$ is a
bijective function, then we can define the inverse function $f^{-1} : B \to A$ by
the condition that $f^{-1}(b) = a$ if and only if $f(a) = b$. This inverse function is also
bijective. A bijective function from a set $A$ to itself is a permutation of $A$. Note that
there is always at least one permutation of any nonempty set $A$, namely the identity
function $a \mapsto a$.
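
For functions between finite sets, these definitions can be checked mechanically. The following Python sketch is an illustration only: it assumes that a function in $B^A$ is represented as a dictionary whose keys are the elements of $A$, and the helper names `is_monic`, `is_epic`, and `inverse` are ours rather than standard notation.

```python
def is_monic(f: dict) -> bool:
    """True if f assigns different values to different elements of its domain."""
    values = list(f.values())
    return len(set(values)) == len(values)


def is_epic(f: dict, codomain) -> bool:
    """True if every element of the codomain is assigned to some domain element."""
    return set(f.values()) == set(codomain)


def inverse(f: dict, codomain) -> dict:
    """Inverse of a bijective f: inverse(f)[b] == a if and only if f[a] == b."""
    if not (is_monic(f) and is_epic(f, codomain)):
        raise ValueError("only a bijective function has an inverse")
    return {b: a for a, b in f.items()}


# The cube map a -> a^3 on A = {-2, ..., 2}, taking B to be its set of values.
A = range(-2, 3)
cube = {a: a ** 3 for a in A}
B = set(cube.values())
cube_inverse = inverse(cube, B)
```

Here `cube` is bijective onto `B`, so `cube_inverse` sends each cube back to its root. A map such as `{0: 1, 1: 1}` fails the monic test, and `inverse` refuses to build an inverse for it.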

The Cartesian product $A_1 \times A_2$ of nonempty sets $A_1$ and $A_2$ is the set of all
ordered pairs $(a_1, a_2)$, where $a_1 \in A_1$ and $a_2 \in A_2$. More generally, if $A_1, \ldots, A_n$
is a list of nonempty sets, then $A_1 \times \cdots \times A_n$ is the set of all ordered $n$-tuples
$(a_1, \ldots, a_n)$ satisfying the condition that $a_i \in A_i$ for each $1 \le i \le n$. Note that each
ordered $n$-tuple $(a_1, \ldots, a_n)$ uniquely defines a function $f : \{1, \ldots, n\} \to \bigcup_{i=1}^n A_i$
given by $f : i \mapsto a_i$ for each $1 \le i \le n$. Conversely, each function
$f : \{1, \ldots, n\} \to \bigcup_{i=1}^n A_i$ satisfying the condition that $f(i) \in A_i$ for $1 \le i \le n$ defines such an
ordered $n$-tuple, namely $(f(1), \ldots, f(n))$. This suggests a method for defining the
Cartesian product of an arbitrary collection of nonempty sets. If $\{A_i \mid i \in \Omega\}$ is an
arbitrary collection of nonempty sets, then the set $\prod_{i \in \Omega} A_i$ is defined to be the set
of all those functions $f$ from $\Omega$ to $\bigcup_{i \in \Omega} A_i$ satisfying the condition that $f(i) \in A_i$
for each $i \in \Omega$. The existence of such functions is guaranteed by a fundamental
axiom of set theory, known as the Axiom of Choice. A certain amount of contro- 
versy surrounds this axiom, since it leads to some very counter-intuitive results. 
Thus, for example, in 1924 Polish mathematicians Stefan Banach and Alfred Tarski 
showed that if the Axiom of Choice is assumed then any solid sphere can be split 
into finitely-many pieces which can be reassembled to form two solid spheres of the 
same size as the original sphere. Therefore, there are mathematicians who prefer to
make as little use of the Axiom of Choice as possible. In 1963, American mathematician
P. J. Cohen showed that the Axiom of Choice is independent of the other
axioms of Zermelo-Fraenkel set theory, and so one is, in principle, free to either
assume it or its negation. Since we will need this axiom constantly throughout this 
book, we will always assume that it holds. 

In the foregoing construction, we did not assume that the sets $A_i$ were necessarily
distinct. Indeed, it may very well happen that there exists a set $A$ such that $A_i = A$
for all $i \in \Omega$. In that case, we see that $\prod_{i \in \Omega} A_i$ is just $A^\Omega$. If the set $\Omega$ is finite,
say $\Omega = \{1, \ldots, n\}$, then we write $A^n$ instead of $A^\Omega$. Thus, $A^n$ is just the set of all
ordered $n$-tuples $(a_1, \ldots, a_n)$ of elements of $A$.


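Example The function $f_2 : \mathbb{N}^2 \to \mathbb{N}$ given by
\[ f_2 : (i, j) \mapsto \tfrac{1}{2}\left(i^2 + j^2 + i + 2ij + 3j\right) \]
is bijective. For $k > 2$ we can define a bijective function $f_k : \mathbb{N}^k \to \mathbb{N}$ inductively
by
\[ f_k : (i_1, \ldots, i_k) \mapsto f_2\bigl(i_1, f_{k-1}(i_2, \ldots, i_k)\bigr). \]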
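
The pairing function of this example is easy to experiment with. The Python sketch below is an illustration under our own naming (`f2`, `fk`, and `f2_inverse` are hypothetical helper names): it evaluates the formula, extends it inductively to $k$ arguments, and inverts $f_2$ by using the observation, not stated in the text, that $f_2(i, j) = \tfrac{1}{2}(i + j)(i + j + 1) + j$.

```python
from itertools import product


def f2(i: int, j: int) -> int:
    """(i, j) -> (i^2 + j^2 + i + 2ij + 3j) / 2 for nonnegative integers i, j."""
    return (i * i + j * j + i + 2 * i * j + 3 * j) // 2  # numerator is always even


def fk(*indices: int) -> int:
    """Inductive extension: f_k(i1, ..., ik) = f_2(i1, f_{k-1}(i2, ..., ik))."""
    if len(indices) < 2:
        raise ValueError("need at least two indices")
    if len(indices) == 2:
        return f2(*indices)
    return f2(indices[0], fk(*indices[1:]))


def f2_inverse(m: int) -> tuple:
    """Recover (i, j) with f2(i, j) == m by first locating s = i + j."""
    s = 0
    while (s + 1) * (s + 2) // 2 <= m:
        s += 1
    j = m - s * (s + 1) // 2
    return s - j, j


# Distinct pairs give distinct values on a 50 x 50 block of N^2.
values = {f2(i, j) for i, j in product(range(50), repeat=2)}
```

A finite check of this kind is evidence, not a proof, of bijectivity; the existence of `f2_inverse` on every nonnegative integer is what the actual argument rests on.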


We use the following standard notation for some common sets of numbers: 


$\mathbb{N}$ the set of all nonnegative integers,
$\mathbb{Z}$ the set of all integers,
$\mathbb{Q}$ the set of all rational numbers,
$\mathbb{R}$ the set of all real numbers,
$\mathbb{C}$ the set of all complex numbers.


Other notation is introduced throughout the text, as is appropriate. See the Summary
of Notation in Appendix A of the book.


Fields 


The way of mathematical thought is twofold: the mathematician first proceeds in- 
ductively from the particular to the general and then deductively from the general 
to the particular. Moreover, throughout its development, mathematics has shown 
two aspects—the conceptual and the computational—the symphonic interleaving of 
which forms one of the major aspects of the subject’s aesthetic. 

Let us therefore begin with the first mathematical structure: numbers. By
Hellenistic times, mathematicians distinguished between two types of numbers: the
rational numbers, namely those which could be written in the form $\frac{m}{n}$ for some integer
$m$ and some positive integer $n$, and those numbers representing the geometric
magnitude of segments of the line, which today we call real numbers and which, in
decimal notation, are written in the form $m.k_1k_2k_3\ldots$ where $m$ is an integer and the
$k_i$ are digits. The fact that the set $\mathbb{Q}$ of rational numbers is not equal to the set $\mathbb{R}$ of
real numbers was already noticed by the followers of the early Greek mathematician/mystic
Pythagoras. On both sets of numbers we define operations of addition
and multiplication which satisfy certain rules of manipulation. Isolating these rules 
as part of a formal system was a task first taken on in earnest by nineteenth-century 
British and German mathematicians. From their studies evolved the notion of a field, 
which will be basic to our considerations. However, since fields are not our primary 
object of study, we will delve only minimally into this fascinating notion. A seri- 
ous consideration of field theory must be deferred to an advanced course in abstract 
algebra. 

A nonempty set $F$ together with two functions $F \times F \to F$, respectively called
addition (as usual, denoted by $+$) and multiplication (as usual, denoted by $\cdot$ or by
concatenation), is a field if the following conditions are satisfied:

(1) (associativity of addition and multiplication): $a + (b + c) = (a + b) + c$ and
$a(bc) = (ab)c$ for all $a, b, c \in F$.

(2) (commutativity of addition and multiplication): $a + b = b + a$ and $ab = ba$ for
all $a, b \in F$.

(3) (distributivity of multiplication over addition): $a(b + c) = ab + ac$ for all
$a, b, c \in F$.

(4) (existence of identity elements for addition and multiplication): There exist distinct
elements of $F$, which we will denote by $0$ and $1$ respectively, satisfying
$a + 0 = a$ and $a1 = a$ for all $a \in F$.

(5) (existence of additive inverses): For each $a \in F$ there exists an element of $F$,
which we will denote by $-a$, satisfying $a + (-a) = 0$.

(6) (existence of multiplicative inverses): For each $0 \neq a \in F$ there exists an element
of $F$, which we will denote by $a^{-1}$, satisfying $a^{-1}a = 1$.

With kind permission of the Archives of the Mathematisches Forschungsinstitut Oberwolfach (Weber, 
Dedekind, Kronecker and Steinitz). 


The development of the abstract theory of fields is generally credited to the nineteenth- 
century German mathematician Heinrich Weber, based on earlier work by the German 
mathematicians Richard Dedekind and Leopold Kronecker. Another nineteenth-century 
mathematician, the British Augustus De Morgan, was among the first—along with French 
mathematician François Joseph Servois—to isolate the importance of such properties as 
associativity, distributivity, and so forth. The final axioms of a field are due to the twentieth- 
century German mathematician Ernst Steinitz. 


Note that we did not assume that the elements $-a$ and $a^{-1}$ are unique, though
we will soon prove that in fact they are. If $a$ and $b$ are elements of a field $F$, we will
follow the usual conventions by writing $a - b$ instead of $a + (-b)$ and $\frac{a}{b}$ instead
of $ab^{-1}$. Moreover, if $0 \neq a \in F$ and if $n$ is a positive integer, then $na$ denotes the
sum $a + \cdots + a$ ($n$ summands) and $a^n$ denotes the product $a \cdots a$ ($n$ factors). If $n$
is a negative integer, then $na$ denotes $(-n)(-a)$ and $a^n$ denotes $(a^{-1})^{-n}$. Finally,
if $n = 0$ then $na$ denotes the field element $0$ and $a^n$ denotes the field element $1$. For
$0 = a \in F$, we define $na = 0$ for all integers $n$ and $a^n = 0$ for all positive integers $n$.
The symbol $0^k$ is not defined for $k \le 0$.

As an immediate consequence of the associativity and commutativity of addition, 
we see that the sum of any list $a_1, \ldots, a_n$ of elements of a field $F$ is the same, no matter
in which order we add them. We can therefore unambiguously write $a_1 + \cdots + a_n$.
This sum is also often denoted by $\sum_{i=1}^n a_i$. Similarly, the product of these elements
is the same, no matter in which order we multiply them. We can therefore unambiguously
write $a_1 \cdots a_n$. This product is also often denoted by $\prod_{i=1}^n a_i$. Also, a
simple inductive argument shows that multiplication distributes over arbitrary sums:
if $a \in F$ and $b_1, \ldots, b_n$ is a list of elements of $F$ then $a\left(\sum_{i=1}^n b_i\right) = \sum_{i=1}^n ab_i$.

We easily see that Q and R, with the usual addition and multiplication, are fields. 

A subset G of a field F is a subfield if and only if it contains 0 and 1, is closed 
under addition and multiplication, and contains the additive and multiplicative in- 
verses of all of its nonzero elements. Thus, for example, Q is a subfield of R. It is 
easy to verify¹ that the intersection of a collection of subfields of a field F is again
a subfield of F. 
We now want to look at several additional important examples of fields. 


Example Let $\mathbb{C} = \mathbb{R}^2$ and define operations of addition and multiplication on $\mathbb{C}$ by
setting $(a, b) + (c, d) = (a + c, b + d)$ and $(a, b) \cdot (c, d) = (ac - bd, ad + bc)$.
These operations define the structure of a field on $\mathbb{C}$, in which the identity element
for addition is $(0, 0)$, the identity element for multiplication is $(1, 0)$, the additive
inverse of $(a, b)$ is $(-a, -b)$, and

\[ (a, b)^{-1} = \left( \frac{a}{a^2 + b^2}, \frac{-b}{a^2 + b^2} \right) \]

for all $(0, 0) \neq (a, b)$. This field is called the field of complex numbers. The set
of all elements of $\mathbb{C}$ of the form $(a, 0)$ forms a subfield of $\mathbb{C}$, which we normally
identify with $\mathbb{R}$ and therefore it is standard to consider $\mathbb{R}$ as a subfield of $\mathbb{C}$. In
particular, we write $a$ instead of $(a, 0)$ for any real number $a$. The element $(0, 1)$ of
$\mathbb{C}$ is denoted by $i$. This element satisfies the condition that $i^2 = (-1, 0)$ and so it is
often written as $\sqrt{-1}$. We also note that any element $(a, b)$ of $\mathbb{C}$ can be written as
$(a, 0) + b(0, 1) = a + bi$, and, indeed, that is the way complex numbers are usually
written and how we will denote them from now on. If z =a + bi, then a is the real 
part of z, which is often denoted by Re(z), while bi is the imaginary part of z, which 
is often denoted by Im(z). The field of complex numbers is extremely important in 
mathematics. From a geometric point of view, if we identify R with the set of points 
on the Euclidean line, as one does in analytic geometry, then it is natural to identify 
C with the set of points in the Euclidean plane. 
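
The construction of the complex numbers as ordered pairs can be tried out directly. The following Python sketch is only an illustration: the function names `c_add`, `c_mul`, and `c_inv` are ours, and exact rational arithmetic (`fractions.Fraction`) is assumed for the inverse so that the identities can be checked without roundoff.

```python
from fractions import Fraction


def c_add(z, w):
    """(a, b) + (c, d) = (a + c, b + d)."""
    (a, b), (c, d) = z, w
    return (a + c, b + d)


def c_mul(z, w):
    """(a, b) . (c, d) = (ac - bd, ad + bc)."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)


def c_inv(z):
    """Multiplicative inverse (a / (a^2 + b^2), -b / (a^2 + b^2)) of (a, b) != (0, 0)."""
    a, b = z
    n = a * a + b * b
    if n == 0:
        raise ZeroDivisionError("(0, 0) has no multiplicative inverse")
    return (Fraction(a) / n, Fraction(-b) / n)


i = (0, 1)
i_squared = c_mul(i, i)             # the pair (-1, 0)
unit = c_mul((3, 4), c_inv((3, 4)))  # the pair (1, 0)
```

Restricting to rational or integer entries is a convenience of the sketch, not of the construction, which works for arbitrary real entries.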




With kind permission of the Harvard Arts Museum (Descartes); With kind permission of ETH-Bibliothek 
Zurich, Image Archive (Euler); With kind permission of Bibliothèque nationale de France (Argand). 


The term “imaginary” was coined by the seventeenth-century French philosopher and mathematician
René Descartes. The use of $i$ to denote $\sqrt{-1}$ was introduced by the eighteenth-century
Swiss mathematician Leonhard Euler. The geometric representation of the complex
numbers was first proposed at the end of the eighteenth century by the Norwegian
surveyor Caspar Wessel, and later by the French accountant Jean-Robert Argand. It was
studied in detail by the nineteenth-century Italian mathematician Giusto Bellavitis.


¹When a mathematician says that something is “easy to see” or “trivial”, it means that you are
expected to take out a pencil and paper and spend some time—often considerable—checking it out 
by yourself. 
If $z = a + bi \in \mathbb{C}$ then we denote the complex number $a - bi$, called the complex
conjugate of $z$, by $\bar{z}$. It is easy to see that for all $z, z' \in \mathbb{C}$ we have $\overline{z + z'} = \bar{z} + \overline{z'}$,
$\overline{-z} = -\bar{z}$, $\overline{zz'} = \bar{z} \cdot \overline{z'}$, $\overline{z^{-1}} = (\bar{z})^{-1}$, and $\bar{\bar{z}} = z$. The number $z\bar{z}$ equals $a^2 + b^2$, which
is a nonnegative real number and so has a square root in $\mathbb{R}$, which we will denote
by $|z|$. Note that $|z|$ is nonzero whenever $z \neq 0$. From a geometric point of view,
this number is just the distance from the number $z$, considered as a point in the
Euclidean plane, to the origin, just as the usual absolute value $|a|$ of a real number
$a$ is the distance between $a$ and $0$ on the real line. It is easy to see that if $y$ and $z$ are
complex numbers then $|yz| = |y| \cdot |z|$ and $|y + z| \le |y| + |z|$. Moreover, if $z = a + bi$
then

\[ z + \bar{z} = 2a \le 2|a| = 2\sqrt{a^2} \le 2\sqrt{a^2 + b^2} = 2|z|. \]

We also note, as a direct consequence of the definition, that $|z| = |\bar{z}|$ for every complex
number $z$ and so $z^{-1} = |z|^{-2}\bar{z}$ for all $0 \neq z \in \mathbb{C}$. In particular, if $|z| = 1$ then
$z^{-1} = \bar{z}$.
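
As a worked illustration of these formulas, take $z = 3 + 4i$; this particular value is chosen here for convenience and does not come from the text.

```latex
\begin{align*}
z &= 3 + 4i, & \bar{z} &= 3 - 4i, \\
z\bar{z} &= 3^2 + 4^2 = 25, & |z| &= \sqrt{25} = 5, \\
z^{-1} &= |z|^{-2}\bar{z} = \tfrac{3}{25} - \tfrac{4}{25}i, &
zz^{-1} &= \tfrac{1}{25}(3 + 4i)(3 - 4i) = \tfrac{25}{25} = 1, \\
z + \bar{z} &= 6 \le 10 = 2|z|.
\end{align*}
```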


Example The set $\mathbb{Q}^2$ is a subfield of the field $\mathbb{C}$ defined above. However, it is also
possible to define field structures on $\mathbb{Q}^2$ in other ways. Indeed, let $F = \mathbb{Q}^2$ and
let $p$ be a fixed prime integer. Define addition and multiplication on $F$ by setting
$(a, b) + (c, d) = (a + c, b + d)$ and $(a, b) \cdot (c, d) = (ac + bdp, ad + bc)$.

Again, one can check that $F$ is indeed a field and that, again, the set of all elements
of $F$ of the form $(a, 0)$ is a subfield, which we will identify with $\mathbb{Q}$. Moreover,
the additive inverse of $(a, b) \in F$ is $(-a, -b)$ and the multiplicative inverse of
$(0, 0) \neq (a, b) \in F$ is

\[ \left( \frac{a}{a^2 - pb^2}, \frac{-b}{a^2 - pb^2} \right). \]

(We note that $a^2 - pb^2$ is the product of the nonzero real numbers $a + b\sqrt{p}$ and
$a - b\sqrt{p}$ and so is nonzero.) The element $(0, 1)$ of $F$ satisfies $(0, 1)^2 = (p, 0)$ and so one
usually denotes it by $\sqrt{p}$ and, as before, any element of $F$ can be written in the form
$a + b\sqrt{p}$, where $a, b \in \mathbb{Q}$. The field $F$ is usually denoted by $\mathbb{Q}(\sqrt{p})$. Since there are
infinitely-many distinct prime integers, we see that there are infinitely-many ways
of defining different field structures on $\mathbb{Q} \times \mathbb{Q}$, all having the same addition.
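
The dependence of the multiplication on the prime $p$ can be made concrete. In the Python sketch below, which is an illustration under our own naming (`sqrt_mul` and `sqrt_inv` are hypothetical helpers), a pair $(a, b)$ of rational numbers stands for $a + b\sqrt{p}$ and the prime $p$ is passed explicitly.

```python
from fractions import Fraction


def sqrt_mul(x, y, p):
    """(a, b) . (c, d) = (ac + bdp, ad + bc), i.e. multiplication of a + b*sqrt(p)."""
    (a, b), (c, d) = x, y
    return (a * c + b * d * p, a * d + b * c)


def sqrt_inv(x, p):
    """Inverse (a / (a^2 - p b^2), -b / (a^2 - p b^2)) of a nonzero pair (a, b)."""
    a, b = x
    n = a * a - p * b * b  # nonzero for rational (a, b) != (0, 0) when p is prime
    if n == 0:
        raise ZeroDivisionError("(0, 0) has no multiplicative inverse")
    return (Fraction(a) / n, Fraction(-b) / n)


root = (0, 1)
in_q_sqrt2 = sqrt_mul(root, root, 2)  # (2, 0): the pair (0, 1) is a square root of 2
in_q_sqrt3 = sqrt_mul(root, root, 3)  # (3, 0): the same pair is now a square root of 3
```

The same pair $(0, 1)$ thus squares to different elements under different choices of $p$, which is one way of seeing that the field structures differ although the addition is shared.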


Example Fields do not have to be infinite. Let $p$ be a positive integer and let
$\mathbb{Z}/(p) = \{0, 1, \ldots, p - 1\}$. For each nonnegative integer $n$, let us, for the purposes
of this example, denote the remainder after dividing $n$ by $p$ as $[n]_p$. Thus
we note that $[n]_p \in \mathbb{Z}/(p)$ for each nonnegative integer $n$ and that $[i]_p = i$ for all
$i \in \mathbb{Z}/(p)$. We now define operations on $\mathbb{Z}/(p)$ by setting $[n]_p + [k]_p = [n + k]_p$
and $[n]_p \cdot [k]_p = [nk]_p$. It is easy to check that if the integer $p$ is prime then $\mathbb{Z}/(p)$,
together with these two operations, is again a field, known as the Galois field of
order $p$. This field is usually denoted by $GF(p)$. While Galois fields were first considered
mathematical curiosities, they have since found important applications in
coding theory, cryptography, and modeling of computer processes.
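
The role of primality in this example can be explored by brute force. The Python sketch below is illustrative and uses our own helper names (`add_mod`, `mul_mod`, `has_inverses`); it checks only the existence of multiplicative inverses, on the assumption that the remaining field conditions hold for arithmetic modulo any integer $p \ge 2$.

```python
def add_mod(n: int, k: int, p: int) -> int:
    """[n]_p + [k]_p = [n + k]_p."""
    return (n + k) % p


def mul_mod(n: int, k: int, p: int) -> int:
    """[n]_p . [k]_p = [nk]_p."""
    return (n * k) % p


def has_inverses(p: int) -> bool:
    """True if every nonzero element of Z/(p) has a multiplicative inverse."""
    if p < 2:
        return False  # 0 and 1 must be distinct elements
    return all(
        any(mul_mod(a, b, p) == 1 for b in range(1, p))
        for a in range(1, p)
    )


# The moduli below 20 for which Z/(p) passes the inverse test.
field_moduli = [p for p in range(2, 20) if has_inverses(p)]
```

For a composite modulus such as $p = 6$ the test fails, since, for instance, no multiple of $2$ leaves remainder $1$ on division by $6$.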
These are not the only possible finite fields. Indeed, it is possible to show that for
each prime integer $p$ and each positive integer $n$ there exists an (essentially unique)
field with $p^n$ elements, usually denoted by $GF(p^n)$.


With kind permission of Bibliothèque nationale de France (Galois); With kind permission of the
American Mathematical Society (Moore).

The nineteenth-century French mathematical genius Évariste Galois, who died at the age of 20,
was the first to consider such structures. The study of finite and infinite fields was unified in the
1890s by Eliakim Hastings Moore, the first American-born mathematician to achieve an international
reputation.


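Example Some important structures are “very nearly” fields. For example, let
$\mathbb{R}_\infty = \mathbb{R} \cup \{\infty\}$, and define operations $\boxplus$ and $\boxdot$ on $\mathbb{R}_\infty$ by setting

\[ a \boxplus b = \begin{cases} \min\{a, b\} & \text{if } a, b \in \mathbb{R}, \\ b & \text{if } a = \infty, \\ a & \text{if } b = \infty, \end{cases} \]

and

\[ a \boxdot b = \begin{cases} a + b & \text{if } a, b \in \mathbb{R}, \\ \infty & \text{otherwise.} \end{cases} \]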


This structure, called the optimization algebra, satisfies all of the conditions of a 
field except for the existence of additive inverses (such structures are known as semi- 
fields). As the name suggests, it has important applications in optimization theory 
and the analysis of discrete-event dynamical systems. There are several other semi- 
fields which have significant applications and which have been extensively studied. 
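
A small experiment makes the failure of additive inverses visible. The Python sketch below is an illustration only: it models the adjoined element by `math.inf`, takes the first operation to be the minimum and the second to be ordinary addition (the usual min-plus reading of this structure), and uses the hypothetical names `oplus` and `otimes`.

```python
import math

INF = math.inf  # plays the role of the adjoined element "infinity"


def oplus(a, b):
    """First operation: the minimum, for which INF is the identity element."""
    return min(a, b)


def otimes(a, b):
    """Second operation: ordinary addition; the result is INF if either argument is INF."""
    return a + b


samples = [-3, 0, 2, 7, INF]

# The second operation distributes over the first on all sampled triples.
distributive = all(
    otimes(a, oplus(b, c)) == oplus(otimes(a, b), otimes(a, c))
    for a in samples for b in samples for c in samples
)

# No sampled x satisfies oplus(2, x) == INF, so 2 has no "additive inverse" among them.
has_additive_inverse = any(oplus(2, x) == INF for x in samples)
```

Since the minimum of $2$ and any other element is at most $2$, no choice of $x$ can bring the result back to the identity element $\infty$, and the sampled check merely reflects this.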


Another possibility of generalizing the notion of a field is to consider an algebraic 
structure which satisfies all of the conditions of a field except for the existence of 
multiplicative inverses, and to replace that condition by the condition that if $a, b \neq 0$
then $ab \neq 0$. Such structures are known as integral domains. The set $\mathbb{Z}$ of all integers
is the simplest example of an integral domain which is not a field. Algebras of 
polynomials over a field, which we will consider later, are also integral domains. In 
a course in abstract algebra, one proves that any integral domain can be embedded 
in a field. 

In the field $GF(p)$ which we defined above, one can easily see that the sum
$1 + \cdots + 1$ ($p$ summands) equals $0$. On the other hand, in the field $\mathbb{Q}$, the sum of
any number of copies of $1$ is always nonzero. This is an important distinction which
we will need to take into account in dealing with structures over fields. We therefore
define the characteristic of a field $F$ to be equal to the smallest positive integer $p$
such that $1 + \cdots + 1$ ($p$ summands) equals $0$, if such an integer $p$ exists, and to be
equal to $0$ otherwise. We will not delve deeply into this concept, which is dealt with
in courses on field theory, except to note that the characteristic of a field, if nonzero,
always turns out to be a prime number, as we shall prove below.
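
The definition of the characteristic translates into a simple search. The Python sketch below is illustrative: the function name `characteristic` and its `bound` parameter are our own, and since a finite search cannot establish characteristic $0$, the function returns `None` when no suitable $p$ is found within the bound.

```python
from fractions import Fraction


def characteristic(add, zero, one, bound=1000):
    """Smallest p >= 1 with 1 + ... + 1 (p summands) == zero, searching p <= bound.

    Returns None if no such p is found, which suggests (but does not prove)
    that the characteristic is 0.
    """
    total = one
    for p in range(1, bound + 1):
        if total == zero:
            return p
        total = add(total, one)
    return None


char_gf7 = characteristic(lambda a, b: (a + b) % 7, 0, 1)          # arithmetic in GF(7)
char_q = characteristic(lambda a, b: a + b, Fraction(0), Fraction(1))  # arithmetic in Q
```

For $GF(7)$ the search stops at $p = 7$, whereas for the rational numbers it exhausts the bound and reports `None`.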

In the definition of a field, we posited the existence of distinct identity elements 
for addition and multiplication, but did not claim that these elements were unique. 
It is, however, very easy to prove that fact. 


Proposition 2.1 Let $F$ be a field.
(1) If $e$ is an element of $F$ satisfying $e + a = a$ for all $a \in F$ then $e = 0$;
(2) If $u$ is an element of $F$ satisfying $ua = a$ for all $a \in F$ then $u = 1$.


Proof By definition, $e = e + 0 = 0$ and $u = u1 = 1$.


Similarly, we prove that additive and multiplicative inverses, when they exist, are 
unique. Indeed, we can prove a stronger result. 


Proposition 2.2 If $a$ and $b$ are elements of a field $F$ then:
(1) There exists a unique element $c$ of $F$ satisfying $a + c = b$.
(2) If $a \neq 0$ then there exists a unique element $d$ of $F$ satisfying $ad = b$.


Proof (1) Choose $c = b - a$. Then

\[ a + c = a + (b - a) = a + [b + (-a)] = a + [(-a) + b] = [a + (-a)] + b = 0 + b = b. \]

Moreover, if $a + x = b$ then

\[ x = 0 + x = [(-a) + a] + x = (-a) + (a + x) = (-a) + b = b - a, \]

proving uniqueness.

(2) Choose $d = a^{-1}b$. Then $ad = a(a^{-1}b) = (aa^{-1})b = 1b = b$. Moreover, if
$ay = b$ then $y = 1y = (a^{-1}a)y = a^{-1}(ay) = a^{-1}b$, proving uniqueness.


We now summarize some of the elementary properties of fields, which are all we 
will need for our discussion. 
Proposition 2.3 If $a$, $b$, and $c$ are elements of a field $F$ then:
(1) $0a = 0$;
(2) $(-1)a = -a$;
(3) $a(-b) = -(ab) = (-a)b$;
(4) $-(-a) = a$;
(5) $(-a)(-b) = ab$;
(6) $-(a + b) = (-a) + (-b)$;
(7) $a(b - c) = ab - ac$;
(8) If $a \neq 0$ then $(a^{-1})^{-1} = a$;
(9) If $a, b \neq 0$ then $(ab)^{-1} = b^{-1}a^{-1}$;
(10) If $a + c = b + c$ then $a = b$;
(11) If $c \neq 0$ and $ac = bc$ then $a = b$;
(12) If $ab = 0$ then $a = 0$ or $b = 0$.


Proof (1) Since $0a + 0a = (0 + 0)a = 0a$, we can add $-(0a)$ to both sides of the
equation to obtain $0a = 0$.

(2) Since $(-1)a + a = (-1)a + 1a = [(-1) + 1]a = 0a = 0$ and also $(-a) + a = 0$,
we see from Proposition 2.2 that $(-1)a = -a$.

(3) By (2) we have $a(-b) = a[(-1)b] = (-1)ab = -(ab)$ and similarly
$(-a)b = -(ab)$.

(4) Since $a + (-a) = 0 = -(-a) + (-a)$, this follows from Proposition 2.2.

(5) From (3) and (4) it follows that $(-a)(-b) = a[-(-b)] = ab$.

(6) Since $(a + b) + [(-a) + (-b)] = a + b + (-a) + (-b) = 0$ and
$(a + b) + [-(a + b)] = 0$, the result follows from Proposition 2.2.

(7) By (3) we have $a(b - c) = ab + a(-c) = ab + [-(ac)] = ab - ac$.

(8) Since $(a^{-1})^{-1}a^{-1} = 1 = aa^{-1}$, this follows from Proposition 2.2.

(9) Since $(a^{-1}b^{-1})(ba) = a^{-1}ab^{-1}b = 1 = (ab)^{-1}(ba)$, the result follows from
Proposition 2.2.

(10) This is an immediate consequence of adding $-c$ to both sides of the equation.

(11) This is an immediate consequence of multiplying both sides of the equation
by $c^{-1}$.

(12) If $b = 0$ we are done. If $b \neq 0$ then by (1) it follows that multiplying both
sides of the equation by $b^{-1}$ will yield $a = 0$.


The following two propositions are immediate consequences of Proposition 2.3. 


Proposition 2.4 Let a be a nonzero element of a finite field F having q elements. 
Then a⁻¹ = a^{q−2}. 


Proof If q = 2 then F = GF(2) and a = 1, so the result is immediate. Hence we 
can assume q > 2. Let B = {a_1, ..., a_{q−1}} be the nonzero elements of F, written 
in some arbitrary order. Then aa_i ≠ aa_h for i ≠ h since, were they equal, 
we would have a_i = a⁻¹(aa_i) = a⁻¹(aa_h) = a_h. Therefore B = {aa_1, ..., aa_{q−1}} 
and so ∏_{i=1}^{q−1} a_i = ∏_{i=1}^{q−1} (aa_i) = a^{q−1} [∏_{i=1}^{q−1} a_i]. Moreover, this is a product of 
nonzero elements of F and so, by Proposition 2.3(12), is also nonzero. Therefore, 
by Proposition 2.3(11), 1 = a^{q−1}, and so aa⁻¹ = 1 = a^{q−1} = a(a^{q−2}), implying 
that a⁻¹ = a^{q−2}. 
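
The formula of Proposition 2.4 can be checked numerically in the simplest case, where F = GF(p) is the field of integers modulo a prime p (so q = p). The following is only a sketch under that assumption; the function name inverse_by_power and the choice p = 7 are ours and purely illustrative.

```python
def inverse_by_power(a: int, p: int) -> int:
    """Return a^(p-2) mod p, the candidate inverse of a in GF(p) given by Proposition 2.4."""
    if a % p == 0:
        raise ValueError("0 has no multiplicative inverse")
    return pow(a, p - 2, p)  # built-in modular exponentiation

p = 7  # an arbitrary small prime, chosen only for illustration
inverses = {a: inverse_by_power(a, p) for a in range(1, p)}

# Each candidate really is an inverse: a * a^(p-2) = 1 in GF(p).
for a, a_inv in inverses.items():
    assert (a * a_inv) % p == 1
```

For a general finite field with q = pⁿ elements the same exponent q − 2 works, but arithmetic in such a field is no longer just arithmetic modulo q, so this sketch does not cover that case.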


Proposition 2.5 If F is a field having characteristic p > 0, then p is prime. 


Proof Assume that p is not prime. Then p = hk, where 0 < h, k < p. Therefore, 
a = h1_F and b = k1_F are nonzero elements of F. But ab = (hk)1_F = p1_F = 0, 
contradicting Proposition 2.3(12). 


Of course, one can use Proposition 2.3 to prove many other identities among 
elements of a field. A typical example is the following. 


Proposition 2.6 (Hua’s identity) If a and b are nonzero elements of a field 
F satisfying a ≠ b⁻¹ then 


a − aba = (a⁻¹ + [b⁻¹ − a]⁻¹)⁻¹. 


Proof We note that 


a⁻¹ + (b⁻¹ − a)⁻¹ = a⁻¹[(b⁻¹ − a) + a](b⁻¹ − a)⁻¹ 
= a⁻¹b⁻¹(b⁻¹ − a)⁻¹, 


so (a⁻¹ + [b⁻¹ − a]⁻¹)⁻¹ = (b⁻¹ − a)ba = a − aba. 
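
As a sanity check, Hua's identity can be tested on concrete rational numbers using exact arithmetic. This is a sketch only: the helper name hua_rhs and the sample values of a and b are our own choices, and a single numerical instance is of course no substitute for the proof.

```python
from fractions import Fraction

def hua_rhs(a: Fraction, b: Fraction) -> Fraction:
    """Right-hand side of Hua's identity: (a^-1 + (b^-1 - a)^-1)^-1."""
    return 1 / (1 / a + 1 / (1 / b - a))

# Arbitrary nonzero rationals with a != 1/b, as the proposition requires.
a, b = Fraction(2, 3), Fraction(5, 7)

lhs = a - a * b * a
assert lhs == hua_rhs(a, b)
```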


Loo-Keng Hua was a major twentieth-century Chinese mathemati- 
cian. 


Exercises 


Exercise 1 

Let F be a field and let G = F x F. Define operations of addition and multiplica- 
tion on G by setting (a, b) + (c, d) = (a +c, b+ d) and (a, b)- (c, d) = (ac, bd). 
Do these operations define the structure of a field on G? 


Exercise 2 
Let K be the set of the following four-tuples of elements of GF(3): 


(0, 0, 0, 0), (1, 2, 1, 1), (2, 1, 2, 2), (1, 0, 0, 1), (2, 2, 1, 2), 
(2, 0, 0, 2), (0, 1, 2, 0), (0, 2, 1, 0), (1, 1, 2, 1). 


Define operations of addition and multiplication on K so that it becomes a field. 


Exercise 3 

Let r ∈ R and let 0 ≠ s ∈ R. Define operations ⊞ and ⊡ on R × R by (a, b) ⊞ 
(c, d) = (a + c, b + d) and (a, b) ⊡ (c, d) = (ac − bd(r² + s²), ad + bc + 2rbd). 
Do these operations, considered as addition and multiplication, respectively, define 
the structure of a field on R × R? 


Exercise 4 

Define a new operation + on R by setting a + b = a*b. Show that R, on which 
we have the usual addition and this new operation as multiplication, satisfies all 
of the axioms of a field with the exception of one. 


Exercise 5 

Let 1 < t ∈ R and let F = {a ∈ R | a < 1}. Define operations ⊕ and ⊙ on F as 
follows: 

(1) a ⊕ b = a + b − ab for all a, b ∈ F; 

(2) a ⊙ b = 1 − t^{log_t(1−a) log_t(1−b)} for all a, b ∈ F. 

For which values of t does F, together with these operations, form a field? 


Exercise 6 
Show that the set of all real numbers of the form a + b√2 + c√3 + d√6, where 
a, b, c, d ∈ Q, forms a subfield of R. 


Exercise 7 
Is {a + b√15 | a, b ∈ Q} a subfield of R? 


Exercise 8 
Show that the field R has infinitely-many distinct subfields. 


Exercise 9 
Let F be a field and define a new operation ∗ on F by setting a ∗ b = a + b + ab. 
When is (F, +, ∗) a field? 


Exercise 10 

Let F be a field and let G_n be the subset of F consisting of all elements which 
can be written as a sum of n squares of elements of F. 

(1) Is the product of two elements of G2 again an element of G2? 

(2) Is the product of two elements of G4 again an element of G4? 


Exercise 11 
Let t = 4/2 € R and let S be the set of all real numbers of the form a + bt + ct?, 
where a, b, c € Q. Is S a subfield of R? 


Exercise 12 
Let F be a field. Show that the function a ↦ a⁻¹ is a permutation of F \ {0_F}. 


Exercise 13 
Show that every z € C satisfies 


z⁴ + 4 = (z − 1 − i)(z − 1 + i)(z + 1 + i)(z + 1 − i). 


Exercise 14 

In each of the following, find the set of all complex numbers z = a + bi satisfying 
the given relation. Note that this set may be empty or may be all of C. Justify your 
result in each case. 

(a) z² = 3(1 + i√3); 

(b) (√2)|z| = |a| + |b|; 

(c) |z| + z = 2 + i; 

(d) z⁴ = 2 − (√12)i; 

(e) z⁴ = −4. 


Exercise 15 
Let y be a complex number satisfying |y| < 1. Find the set of all complex num- 
bers z satisfying |z — y| < |1 — yz]. 


Exercise 16 
Let z_1, z_2, and z_3 be complex numbers satisfying the condition that |z_i| = 1 for 
i = 1, 2, 3. Show that |z_1z_2 + z_1z_3 + z_2z_3| = |z_1 + z_2 + z_3|. 


Exercise 17 
For any z_1, z_2 ∈ C, show that |z_1|² + |z_2|² − z_1z̄_2 − z̄_1z_2 = |z_1 − z_2|². 


Exercise 18 
Show that |z + 1| ≤ |z + 1|² + |z| for all z ∈ C. 


Exercise 19 
If z ∈ C, find w ∈ C satisfying w² = z. 


Exercise 20 
Define new operations o and ¢ on C by setting y oz = |y|z and 


_ J0 if y=0, 
eo pz otherwise 


for all y, z € C. Is it true that w © (y o z) = (w © y) o (w è z) and w o (y © z) = 
(wo y) o (w oz) for all w, y, z € C? 


Exercise 21 
Let 0 ≠ z ∈ C. Show that there are infinitely-many complex numbers y satisfying 
the condition yȳ = zz̄. 


Exercise 22 
(Abel’s inequality) Let z_1, ..., z_n be a list of complex numbers and, for each 
1 ≤ k ≤ n, let s_k = ∑_{i=1}^{k} z_i. For real numbers a_1, ..., a_n satisfying 
a_1 ≥ a_2 ≥ ··· ≥ a_n ≥ 0, show that |∑_{i=1}^{n} a_i z_i| ≤ a_1 (max_{1≤k≤n} |s_k|). 




The nineteenth-century Norwegian mathematical genius Niels Henrik 
Abel died tragically at the age of 26. 


Exercise 23 
Let 0 ≠ z_0 ∈ C satisfy the condition |z_0| < 2. Show that there are precisely two 
complex numbers, z_1 and z_2, satisfying |z_1| = |z_2| = 1 and z_1 + z_2 = z_0. 


Exercise 24 
If p is a prime positive integer, find all subfields of GF(p). 


Exercise 25 
Find 10⁻¹ in GF(33). 


Exercise 26 
Find elements c, d ≠ ±1 in the field Q(√5) satisfying cd = 19. 


Exercise 27 
Let F be the set of all real numbers of the form 


a + b(∛5) + c(∛5)², 
where a, b, c € Q. Is F a subfield of R? 


Exercise 28 
Let p be a prime positive integer and let a € GF(p). Does there necessarily exist 
an element b of GF(p) satisfying b² = a? 


Exercise 29 

Let F = GF(11) and let G = F x F. Define operations of addition and multi- 
plication on G by setting (a, b) + (c,d) = (a + c,b + d) and (a,b)- (c,d) = 
(ac + 7bd,ad + bc). Do these operations define the structure of a field on G? 


Exercise 30 

Let F be a field and let G be a finite subset of F \ {0} containing 1 and satisfying 
the condition that if a, b ∈ G then ab⁻¹ ∈ G. Show that there exists an element 
c ∈ G such that G = {c^i | i ≥ 0}. 


Exercise 31 
Let F be a field satisfying the condition that the function a ↦ a² is a permutation 
of F. What is the characteristic of F? 


Exercise 32 
Is Z/(6) an integral domain? 


Exercise 33 
Let F = {a + b√5 ∈ Q(√5) | a, b ∈ Z}. Is F an integral domain? 


Exercise 34 
Let F be an integral domain and let a ∈ F satisfy a² = a. Show that a = 0 or 
a = 1. 


Exercise 35 
Let a be a nonzero element in an integral domain F. If b ≠ c are distinct elements 
of F, show that ab ≠ ac. 


Exercise 36 

Let F be an integral domain and let G be a nonempty subset of F containing 0 
and 1 and closed under the operations of addition and multiplication in F. Is G 
necessarily an integral domain? 


Exercise 37 
Let U be the set of all positive integers and let F be the set of all functions 
from U to C. Define operations of addition and multiplication on F by setting 
f + g : k ↦ f(k) + g(k) and fg : k ↦ ∑_{ij=k} f(i)g(j) for all k ∈ U. Is F, together 
with these operations, an integral domain? Is it a field? 


Exercise 38 

Let F be the set of all functions f from R to itself of the form f : t ↦ 
∑_{k=0}^{n} [a_k cos(kt) + b_k sin(kt)], where the a_k and b_k are real numbers and n 
is some positive integer. Define addition and multiplication on F by setting 
f + g : t ↦ f(t) + g(t) and fg : t ↦ f(t)g(t) for all t ∈ R. Is F, together 
with these operations, an integral domain? Is it a field? 


Exercise 39 
Show that every integral domain having only finitely-many elements is a field. 


Exercise 40 

Let F be a field of characteristic other than 2 in which there exist elements 
a_1, ..., a_n satisfying ∑_{i=1}^{n} a_i² = −1. (This happens, for example, in the case 
F = C.) Show that for any c ∈ F there exist elements b_1, ..., b_k of F satisfying 
c = ∑_{i=1}^{k} b_i². 


Exercise 41 
Let p be a prime integer. Show that for each a ∈ GF(p) there exist elements b 
and c of GF(p), not necessarily distinct, satisfying a = b² + c². 


Exercise 42 
Let F be a field in which we have elements a, b, and c (not necessarily distinct) 
satisfying a² + b² + c² = −1. Show that there exist (not necessarily distinct) 
elements d and e of F, satisfying d² + e² = −1. 


Exercise 43 
Is every nonzero element of the field GF(5) in the form 2^i for some positive 
integer i? What happens in the case of the field GF(7)? 


Exercise 44 
Find the set of all fields F in which there exists an element a satisfying the 
condition that a + b =a for all b € F ~ {a}. 


Exercise 45 

(Binomial formula) If a and b are elements of a field F, and if n is a positive 
integer, show that (a + b)^n = ∑_{k=0}^{n} C(n, k) a^k b^{n−k}, where C(n, k) denotes the binomial coefficient. 

Exercise 46 


Let F be a field of characteristic p > 0. Show that the function γ : F → F 
defined by γ : a ↦ a^p is monic. 


Exercise 47 

Let a and b be nonzero elements of a finite field F, and let m and n be positive 
integers satisfying a^m = b^n = 1. Show that there exists a nonzero element c of F 
satisfying c^k = 1, where k is the least common multiple of m and n. 


Exercise 48 
If a is a nonzero element of a field F, show that (−a)⁻¹ = −(a⁻¹). 


Exercise 49 

Let F = GF(7) and let K = F x F. Define addition and multiplication on K by 
setting (a, b) + (c, d) = (a + c, b + d) and (a, b) · (c, d) = (ac − bd, ad + bc). 
Do these operations turn K into a field? What happens if F = GF(5)? 


Exercise 50 

A field F is orderable if and only if there exists a subset P closed under addition 
and multiplication such that for each a € F precisely one of the following condi- 
tions holds: (i) a = 0; (ii) a € P; (iii) —a € P. Show that GF(5) is not orderable. 


Exercise 51 

Let F be a field and let K be the set of all functions f ∈ F^Z satisfying the 
condition that there exists an integer (perhaps negative) n_f such that f(i) = 0 
for all i < n_f. Define operations of addition and multiplication on K by setting 
f + g : i ↦ f(i) + g(i) and fg : i ↦ ∑_{j+h=i} f(j)g(h). Show that K is a field, 
called the field of formal Laurent series over F. 


Exercise 52 
Let F be a field. Find A = {(x, y) ∈ F² | x² + y² = 1}. 


Exercise 53 
Let F be a field having characteristic p > 0 and let c ∈ F. Show that there is at 
most one element b of F satisfying b^p = c. 


Exercise 54 

A ternary ring is a set R containing distinguished elements 0 and 1, together 
with a function θ : R³ → R satisfying the following conditions: 

(1) θ(1, a, 0) = θ(a, 1, 0) = a for all a ∈ R; 

(2) θ(a, 0, c) = θ(0, a, c) = c for all a, c ∈ R; 

(3) If a, b, c ∈ R then there is a unique element y of R satisfying θ(a, b, y) = c; 

(4) If a, a′, b, b′ ∈ R with a ≠ a′ then there is a unique element x of R satisfying 
θ(x, a, b) = θ(x, a′, b′); 

(5) If a, a′, b, b′ ∈ R with a ≠ a′ then there are unique elements x and y of R 
satisfying θ(a, x, y) = b and θ(a′, x, y) = b′. 


These series were first studied by the nineteenth-century French engineer and mathematician, 
Pierre Alphonse Laurent. 


Such structures have applications in projective geometry. If F is a field, show that 
we can define a function θ : F³ → F in such a way that F becomes a ternary ring 
(with 0 and 1 being the neutral elements of the field). 


Exercise 55 

For h = 1, 2, 3, let z_h = a_h + b_h i be a complex number satisfying |z_h| = 1. Assume, 
moreover, that ∑_{h=1}^{3} z_h = 0. Show that the points (a_h, b_h) are the vertices 
of an equilateral triangle in the Euclidean plane. 


Vector Spaces Over a Field 


If n > 1 is an integer and if F is a field, it is natural to define addition on the set F^n 
componentwise: 


(a_1, ..., a_n) + (b_1, ..., b_n) = (a_1 + b_1, ..., a_n + b_n). 


More generally, if Ω is any nonempty set and if F^Ω is the set of all functions from 
Ω to the field F, we can define addition on F^Ω by setting f + g : i ↦ f(i) + g(i) 
for each i ∈ Ω. Given these definitions, is it possible to define multiplication in such 
a manner that F^n or F^Ω will become a field naturally containing F as a subfield? 
We have seen that if n = 2 and if F = R or F = Q, this is possible and, indeed, in 
the latter case there are several different methods of doing it. If F = GF(p) then it 
is possible to define such a field structure on F^n for every integer n > 1. However, 
in general the answer is negative, as we will show in a later chapter for the specific 
case of R^k, where k > 2 is an odd integer. Nonetheless, it is possible to construct 
another important and useful structure on these sets, and this structure will be the 
focus of our attention for the rest of this book. We will first give the formal definition, 
and then look at a large number of examples. 

Let F be a field. A nonempty set V, together with a function V × V → V called 
vector addition (denoted, as usual, by +) and a function F × V → V called scalar 
multiplication (denoted, as a rule, by concatenation) is a vector space over F if the 
following conditions are satisfied: 

(1) (associativity of vector addition): v + (w + y) = (v + w) + y for all v, w, y ∈ V. 

(2) (commutativity of vector addition): v + w = w + v for all v, w ∈ V. 

(3) (existence of an identity element for vector addition): There exists an element 0_V 
of V satisfying the condition that v + 0_V = v for all v ∈ V. 

(4) (existence of additive inverses): For each v ∈ V there exists an element of V, 
which we will denote by −v, which satisfies v + (−v) = 0_V. 

(5) (distributivity of scalar multiplication over vector addition and of scalar 
multiplication over field addition): a(v + w) = av + aw and (a + b)v = av + bv for 
all a, b ∈ F and v, w ∈ V. 


J.S. Golan, The Linear Algebra a Beginning Graduate Student Ought to Know, 21 
DOI 10.1007/978-94-007-2636-9_3, © Springer Science+Business Media B.V. 2012 

(6) (associativity of scalar multiplication): (ab)v = a(bv) for all a, b ∈ F and 
v ∈ V. 

(7) (existence of identity element for scalar multiplication): 1v = v for all v ∈ V. 

The elements of V are called vectors and the elements of F are called scalars. 
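
For a small finite example, conditions (1) to (7) can be verified exhaustively by machine. The sketch below assumes V = GF(3)² with componentwise operations modulo 3; the names P, N, add, scale and check_axioms are ours and serve only as an illustration of what the axioms demand.

```python
from itertools import product

P, N = 3, 2  # illustrative choices: the field GF(3) and vectors with 2 entries

def add(v, w):
    """Componentwise vector addition in GF(P)^N."""
    return tuple((x + y) % P for x, y in zip(v, w))

def scale(a, v):
    """Scalar multiplication of the vector v by the scalar a."""
    return tuple((a * x) % P for x in v)

def check_axioms() -> bool:
    scalars = range(P)
    vectors = list(product(scalars, repeat=N))
    zero = (0,) * N
    for v in vectors:
        assert add(v, zero) == v                     # condition (3)
        assert add(v, scale(P - 1, v)) == zero       # condition (4), using -v = (-1)v
        assert scale(1, v) == v                      # condition (7)
        for w in vectors:
            assert add(v, w) == add(w, v)            # condition (2)
            for y in vectors:
                assert add(v, add(w, y)) == add(add(v, w), y)   # condition (1)
            for a in scalars:
                assert scale(a, add(v, w)) == add(scale(a, v), scale(a, w))  # (5)
        for a in scalars:
            for b in scalars:
                assert scale((a + b) % P, v) == add(scale(a, v), scale(b, v))  # (5)
                assert scale((a * b) % P, v) == scale(a, scale(b, v))          # (6)
    return True

check_axioms()
```

Such a brute-force check is feasible only because both the field and the space are finite and tiny; it is meant to make the conditions concrete, not to replace a proof.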




The theory of vector spaces was developed in the 1880s by the American engineer and 
physicist, Josiah Willard Gibbs and the British engineer Oliver Heaviside, based on the 
work of the Scottish physicist James Clerk Maxwell, the German high-school teacher 
Herman Grassmann, and the French engineer Jean Claude Saint-Venant. 


Example Note that condition (7), apparently trivial, does not follow from the other 
conditions. Indeed, if we take V = F but define scalar multiplication by av = 0_V 
for all a ∈ F and v ∈ V, we would get a structure which satisfies conditions (1)–(6) 
but not condition (7). 


If v, w ∈ V we again write v − w instead of v + (−w). As we noted when we 
talked about fields, if v_1, ..., v_n is a list of vectors in a vector space V over a field F, 
the associativity of vector addition allows us to unambiguously write v_1 + ··· + v_n, 
and this sum is often denoted by ∑_{i=1}^{n} v_i. Moreover, if a ∈ F is a scalar then we 
surely have a(∑_{i=1}^{n} v_i) = ∑_{i=1}^{n} av_i. Similarly, if a_1, ..., a_n is a list of scalars and 
if v ∈ V, then we have (∑_{i=1}^{n} a_i)v = ∑_{i=1}^{n} a_i v. We will also adopt the convention 
that the sum of an empty set of vectors is equal to 0_V. 

Clearly, any field F is a vector space over itself, where we take the vector addition 
to be the addition in F and scalar multiplication to be the multiplication in F. 

We also note an extremely important construction. Let F be a field and let Ω be 
a nonempty set. Assume that, for each i ∈ Ω, we are given a vector space V_i over F, 
the addition in which we will denote by +_i (the vector spaces V_i need not, however, 
be distinct from one another). Recall that ∏_{i∈Ω} V_i is the set of all those functions f 
from Ω to ⋃_{i∈Ω} V_i which satisfy the condition that f(i) ∈ V_i for each i ∈ Ω. We 
now define the structure of a vector space on ∏_{i∈Ω} V_i as follows: if f, g ∈ ∏_{i∈Ω} V_i 
then f + g is the function in ∏_{i∈Ω} V_i given by f + g : i ↦ f(i) +_i g(i) for each 
i ∈ Ω. Moreover, if a ∈ F and f ∈ ∏_{i∈Ω} V_i, then af is the function in ∏_{i∈Ω} V_i 
given by af : i ↦ a[f(i)] for each i ∈ Ω. It is routine to verify that all of the 
axioms of a vector space are satisfied in this case. For example, the identity element 
for vector addition is just the function in ∏_{i∈Ω} V_i given by i ↦ 0_{V_i} for each i ∈ Ω. 
This vector space is called the direct product of the vector spaces V_i over F. If 
the set Ω is finite, say Ω = {1, ..., n}, then we often write V_1 × ··· × V_n instead 
of ∏_{i∈Ω} V_i. If all of the vector spaces V_i are equal to the same vector space V, 
then we write V^Ω instead of ∏_{i∈Ω} V_i and if Ω = {1, ..., n} we write V^n instead 
of V^Ω. Note that a function f from a finite set Ω = {1, ..., n} to a vector space V 
is totally defined by the list f(1), f(2), ..., f(n) of its values. Conversely, any list 
v_1, ..., v_n of elements of V uniquely defines such a function f given by f : i ↦ v_i. 
Therefore, this notation agrees with our previous use of the symbol V^n to denote 
sets of n-tuples of elements of V. However, to emphasize the vector space structure 
here, we will write the elements of V^n as columns of the form 

    [ v_1 ]
    [  ⋮  ]
    [ v_n ]

where the v_i are (not necessarily distinct) elements of V. Usually, we will consider the case 
V = F. Vector addition and scalar multiplication in V^n are then defined by the rules 

    [ v_1 ]   [ w_1 ]   [ v_1 + w_1 ]             [ v_1 ]   [ cv_1 ]
    [  ⋮  ] + [  ⋮  ] = [     ⋮     ]    and    c [  ⋮  ] = [   ⋮  ]
    [ v_n ]   [ w_n ]   [ v_n + w_n ]             [ v_n ]   [ cv_n ]

The “classical” study of vector spaces centers around the spaces R^n, the vectors 
in which are identified with the points in n-dimensional Euclidean space. However, 
other vector spaces also have important applications. Vector spaces of the form C^n 
are needed for the study of functions of several complex variables. In algebraic 
coding theory, one is interested in spaces of the form F^n, where F is a finite field. 
The vectors in this space are words of length n and the field F is the alphabet in 
which these words are written. Thus, one choice for F is the Galois field GF(2⁸), 
the 256 elements of which are identified with the 256 ASCII symbols. 



The first explicit statement of the geometric “par- 
allelogram law” for adding geometric vectors 
was given by the sixteenth-century Pisan scien- 
tist Galileo Galilei. This idea was extended at the 
beginning of the nineteenth century by Bohemian 
priest Bernard Bolzano. 


Let V be a vector space, let k and n be positive integers, and let Ω = {(i, j) | 1 ≤ 
i ≤ k, 1 ≤ j ≤ n}. There exists a bijective correspondence between V^Ω and the set 
of all rectangular arrays of the form 

    [ v_11 ... v_1n ]
    [   ⋮        ⋮  ]
    [ v_k1 ... v_kn ]

in which the entries v_ij are elements of V. Such an array is called a k × n matrix 
over V. We will denote the set of all such matrices by M_{k×n}(V). Addition in 
M_{k×n}(V) is given by 

    [ v_11 ... v_1n ]   [ w_11 ... w_1n ]   [ v_11 + w_11 ... v_1n + w_1n ]
    [   ⋮        ⋮  ] + [   ⋮        ⋮  ] = [      ⋮                ⋮      ]
    [ v_k1 ... v_kn ]   [ w_k1 ... w_kn ]   [ v_k1 + w_k1 ... v_kn + w_kn ]

and scalar multiplication in M_{k×n}(V) is given by 

      [ v_11 ... v_1n ]   [ cv_11 ... cv_1n ]
    c [   ⋮        ⋮  ] = [    ⋮          ⋮  ]
      [ v_k1 ... v_kn ]   [ cv_k1 ... cv_kn ]

The identity element for vector addition in M_{k×n}(V) is the 0-matrix O, all entries 
of which are equal to 0_V. Note that V^n = M_{n×1}(V). 


The term “matrix” was first coined by the nineteenth-century British 
mathematician James Joseph Sylvester, one of the major researchers 
in the theory of matrices and determinants. 


If V is a vector space and if Ω = N, then the elements of V^Ω are infinite 
sequences [v_0, v_1, ...] of elements of V. We will denote this vector space, which we 
will need later, by V^∞. Again, the space of particular interest will be F^∞. 


Example If F is a subfield of a field K, then K is a vector space over F, with 
addition and scalar multiplication just being the corresponding operations in K. 
Thus, in particular, we can think of C as a vector space over R and of R as a vector 
space over Q. 


Example Let A be a nonempty set and let V be the collection of all subsets of A. 
Let us define addition of elements of V as follows: if B and C are elements of V 
then B + C = (B ∪ C) \ (B ∩ C). This operation is usually called the symmetric 
difference of B and C. This definition turns V into a vector space over GF(2), where 
scalar multiplication is defined by 0B = ∅ and 1B = B for all B ∈ V. This is 
actually just a special case of what we have seen before. Indeed, we note that there 
is a bijective function from V to GF(2)^A which assigns to each subset B of A its 
characteristic function, namely the function χ_B defined by 


    χ_B : a ↦ 1 if a ∈ B, 
              0 otherwise, 


and it is easy to see that χ_B + χ_C = χ_{B+C}, while χ_B χ_C = χ_{B∩C}. 
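
The correspondence between subsets and characteristic functions is easy to try out on a small set. In the sketch below the ground set, the subsets B and C, and the helper names chi, add_mod2 and mul are all our own illustrative choices; a characteristic function is stored as a tuple of 0s and 1s indexed by the elements of the ground set.

```python
A = ["p", "q", "r", "s"]  # a small illustrative ground set

def chi(B):
    """Characteristic function of the subset B of A, as a tuple of 0/1 values."""
    return tuple(1 if a in B else 0 for a in A)

def add_mod2(f, g):
    """Pointwise addition in GF(2)."""
    return tuple((x + y) % 2 for x, y in zip(f, g))

def mul(f, g):
    """Pointwise multiplication in GF(2)."""
    return tuple(x * y for x, y in zip(f, g))

B, C = {"p", "q"}, {"q", "r"}
sym_diff = (B | C) - (B & C)  # the sum B + C defined in the example

# Addition of subsets matches addition of characteristic functions,
# and intersection matches their pointwise product.
assert chi(sym_diff) == add_mod2(chi(B), chi(C))
assert chi(B & C) == mul(chi(B), chi(C))
```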




Proposition 3.1 Let V be a vector space over a field F. 
(1) If z ∈ V satisfies z + v = v for all v ∈ V then z = 0_V. 
(2) If v, w ∈ V then there exists a unique element y ∈ V satisfying v + y = w. 


Proof The proof is similar to the proofs of Proposition 2.1(1) and Proposi- 
tion 2.2(1). 


Proposition 3.2 Let V be a vector space over a field F. If v, w ∈ V and if 
a ∈ F, then: 
(1) a0_V = 0_V; 
(2) 0v = 0_V; 
(3) (−1)v = −v; 
(4) (−a)v = −(av) = a(−v); 
(5) −(−v) = v; 
(6) av = (−a)(−v); 
(7) −(v + w) = −v − w; 
(8) a(v − w) = av − aw; 
(9) If av = 0_V then either v = 0_V or a = 0. 


Proof The proof is similar to the proof of Proposition 2.3. 


Let V be a vector space over a field F. A nonempty subset W of V is a subspace 
of V if and only if it is a vector space in its own right with respect to the addition and 
scalar multiplication defined on V. Thus, any vector space V is a subspace of itself, 
called the improper subspace; any other subspace is proper. Also, {0_V} is surely a 
subspace of V, called the trivial subspace; any other subspace is nontrivial. 
Note that the two conditions for a nonempty subset of a vector space to be a 
subspace are independent: the set of all vectors in R³ all entries of which are integers 
is closed under vector addition but not under scalar multiplication; the set of all 
vectors 

    [ a ]
    [ b ] ∈ R³
    [ c ]

satisfying abc = 0 is closed under scalar multiplication but not 
under vector addition. 
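
The two failures of closure can be exhibited with explicit vectors. The sketch below works with vectors having three entries and uses exact rational numbers as a stand-in for real scalars; the particular vectors and the helper names are our own choices.

```python
from fractions import Fraction

def all_integers(v):
    """True if every entry of v is an integer."""
    return all(Fraction(x).denominator == 1 for x in v)

def product_zero(v):
    """True if the entries a, b, c of v satisfy abc = 0."""
    a, b, c = v
    return a * b * c == 0

def add(v, w):
    return tuple(x + y for x, y in zip(v, w))

def scale(c, v):
    return tuple(c * x for x in v)

# Integer vectors: a sum stays integral, but scaling by 1/2 leaves the set.
u, w = (1, 2, 3), (4, 0, -1)
assert all_integers(add(u, w))
assert not all_integers(scale(Fraction(1, 2), u))

# Vectors with abc = 0: scaling stays inside, but a sum can leave the set.
e1, e23 = (1, 0, 0), (0, 1, 1)
assert product_zero(e1) and product_zero(e23)
assert product_zero(scale(Fraction(7, 3), e23))
assert not product_zero(add(e1, e23))
```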


Example Let V be a vector space over a field F and let Ω be a nonempty set. We 
have already seen that the set V^Ω of all functions from Ω to V is a vector space 
over F. If Λ is a subset of Ω then the set {f ∈ V^Ω | f(i) = 0_V for all i ∈ Λ} is a 
subspace of V^Ω. In particular, if k < n are positive integers, then we can think of 
V^k as being a subspace of V^n, by identifying it with 

    { [ v_1 ]                                       }
    { [  ⋮  ] ∈ V^n  |  v_{k+1} = ··· = v_n = 0_V }.
    { [ v_n ]                                       }

Note that if y ∈ V, then {f ∈ V^Ω | f(i) = y for all i ∈ Λ} is not a subspace 
of V^Ω unless y = 0_V. 


Example Let {V_i | i ∈ Ω} be a collection of vector spaces over a field F. The set 
of all functions f ∈ ∏_{i∈Ω} V_i satisfying the condition that f(i) ≠ 0_{V_i} for at most 
finitely-many elements i of Ω is a subspace of ∏_{i∈Ω} V_i, called the direct coproduct 
of the spaces V_i and denoted by ∐_{i∈Ω} V_i. The direct coproduct is a proper subset of 
∏_{i∈Ω} V_i when and only when the set Ω is infinite. If each of the spaces V_i is equal 
to a given vector space V, we write V^{(Ω)} instead of ∐_{i∈Ω} V_i. 


Example If V is a vector space over a field F and if v ∈ V, then the set Fv = {av | 
a ∈ F} is a subspace of V which is contained in any subspace of V containing v. 


Example Let R be the field of real numbers and let Ω be either equal to R, to some 
closed interval [a, b] on the real line, or to a ray [a, ∞) on the real line. We have 
already seen that the set R^Ω of all functions from Ω to R is a vector space over R. 
The set of all continuous functions from Ω to R is a subspace of this vector space, 
as are the set of all differentiable functions from Ω to R, the set of all 
infinitely-differentiable functions from Ω to R, and the set of all analytic functions from Ω 
to R. If a < b are real numbers, we will denote the space of all continuous functions 
from the closed interval [a, b] to R by C(a, b). If a ∈ R we will denote the space of 
all continuous functions from [a, ∞) to R by C(a, ∞). These spaces will be very 
important to us later. 


Proposition 3.3 If V is a vector space over a field F , then a nonempty subset 
W of V is a subspace of V if and only if it is closed under addition and scalar 
multiplication. 


Proof If W is a subspace of V then it is surely closed under addition and scalar 
multiplication. Conversely, suppose that it is so closed. Then for any w ∈ W we 
have 0_V = 0w ∈ W and −w = (−1)w ∈ W. The other conditions are satisfied in W 
since they are satisfied in V. 



The first fundamental research in spaces of functions was done by the 
German mathematician Erhard Schmidt, a student of David Hilbert, 
whose work forms one of the bases of functional analysis. 


Proposition 3.4 If V is a vector space over a field F, and if {W_i | i ∈ Ω} is a 
collection of subspaces of V, then ⋂_{i∈Ω} W_i is a subspace of V. 


Proof Set W = ⋂_{i∈Ω} W_i. If w, y ∈ W then, for each i ∈ Ω, we have w, y ∈ W_i and 
so w + y ∈ W_i. Thus w + y ∈ W. Similarly, if a ∈ F and w ∈ W then aw ∈ W_i for 
each i ∈ Ω, and so aw ∈ W. 


We will also set the convention that the intersection of an empty collection 
of subspaces of V is V itself. Subspaces W and W′ are disjoint if and only if 
W ∩ W′ = {0_V}. More generally, a collection {W_i | i ∈ Ω} of subspaces of V is 
pairwise disjoint if and only if W_i ∩ W_j = {0_V} for i ≠ j in Ω. (Note that 
disjointness of subspaces of a given space is not the same as disjointness of subsets!) 


Now let us look at a very important method of constructing subspaces of vector 
spaces. Let D be a nonempty set of elements of a vector space V over a field F. 
A vector v ∈ V is a linear combination of elements of D over F if and only if there 
exist elements v_1, ..., v_n of D and scalars a_1, ..., a_n in F such that v = ∑_{i=1}^{n} a_i v_i. 
We will denote the set of all linear combinations of elements of D over F by FD. 
Note that if v ∈ V then F{v} is the set Fv which we defined earlier. 

It is clear that if D is a nonempty set of elements of a vector space V over a 
field F then D ⊆ FD. Also, 0_V ∈ FD for any nonempty subset D of V, and it 
is the only vector belonging to each of the sets FD. To simplify notation, we will 
therefore define F∅ to be {0_V}. If D′ ⊆ D then surely FD′ ⊆ FD. We also note 
that FD = F(D ∪ {0_V}) for any subset D of V. 


1 0 0 3 
Example If D= O;,) 1 and D’ = 21,13 are subsets of R3, then 
0 0 0 0 
a 
FD = FD' = b || a,b eR }. Indeed, 
0 


à 1 0 E 212 
bl=aļlo|+bl1 A ) 2|+2|3]| foralla,beR. 
0 0 ol 310 
4 0 2 2 
2|=1|0|+1|2|+1ļ0 
4 4 0 0 
2 5 1 
=1|o|+6Dl|2ļ|+4ļ1 
0 0 1 


Thus we see that there may be several ways of representing a vector as a linear 
combination of elements of a given subset of a vector space. 


Proposition 3.5 Let D be a subset of a vector space V over a field F. Then: 
(1) FD is a subspace of V; 

(2) Every subspace of V containing D also contains F D; 

(3) FD is the intersection of all subspaces of V containing D. 


Proof If D = ∅ then FD = {0_V} and we are done. Thus we can assume that D is 
nonempty. It is an immediate consequence of the definitions that the sum of two lin- 
ear combinations of elements of D over F is again a linear combination of elements 
of D over F, and that the product of a scalar and a linear combination of elements 
of D over F is again a linear combination of elements of D over F. This proves (1). 
Moreover, (2) is an immediate consequence of (1) and Proposition 3.3, while (3) 
follows directly from (2). 


If D is a subset of a vector space V over a field F then the subspace FD of V is 
called the subspace generated or spanned by D, and the set D is called a generating 
set or spanning set for this subspace. In particular, we note that ∅ is a generating 
set for {0_V}. 


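Example Let F be a field. Then 

          { [1]   [0]   [0] }
    A =   { [0] , [1] , [0] }
          { [0]   [0]   [1] }

is a generating set for F³ over F. The set 

          { [1]   [1]   [0] }
    B =   { [1] , [0] , [1] }
          { [0]   [1]   [1] }

is also a generating set for F³ if the characteristic of F is other than 2, but not 
for F = GF(2), since the vector with entries 1, 0, 0 (from top to bottom) does not 
belong to GF(2)B. The set 

          { [1]   [0]   [1] }
    D =   { [0] , [1] , [1] }
          { [0]   [0]   [0] }

is not a generating set for F³ for any field F since the vector with entries 0, 0, 1 
does not belong to FD. 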
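
Over a finite field GF(p) the set FD can be computed by listing every linear combination, which makes claims of this kind easy to test. The sketch below enters B and D as we read them from the example (the bottom row of D is illegible in our copy and is assumed to consist of zeros); the function name span and the primes used are our own choices.

```python
from itertools import product

def span(vectors, p):
    """All linear combinations of `vectors` over GF(p), found by brute force."""
    n = len(vectors[0])
    result = set()
    for coeffs in product(range(p), repeat=len(vectors)):
        combo = tuple(
            sum(c * v[i] for c, v in zip(coeffs, vectors)) % p for i in range(n)
        )
        result.add(combo)
    return result

B = [(1, 1, 0), (1, 0, 1), (0, 1, 1)]
D = [(1, 0, 0), (0, 1, 0), (1, 1, 0)]  # bottom row assumed to be all zeros

assert len(span(B, 3)) == 3 ** 3       # B generates GF(3)^3 ...
assert (1, 0, 0) not in span(B, 2)     # ... but does not generate GF(2)^3
assert (0, 0, 1) not in span(D, 5)     # D misses (0, 0, 1), here over GF(5)
```

The enumeration visits p^k coefficient tuples for k vectors, so it is practical only for very small cases; for anything larger one would use row reduction instead.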


Often, in applications, we need to restrict ourselves to linear combinations of 
special type. For example, let V be a vector space over a field F and let D be a 
nonempty subset of V. An affine combination of elements of D is an element of 
V of the form ∑_{i=1}^{n} a_i v_i, where the v_i are elements of D and the a_i are scalars 
satisfying ∑_{i=1}^{n} a_i = 1. This is usually interpreted as a weighted average of the 
vectors v_i. The set of all affine combinations of elements of D is called the affine 
hull of D and is denoted by affh(D). In general, this is not a subspace of V. One 
can, however, easily verify that affh(affh(D)) = affh(D) for any set D. 
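
A small computation shows why an affine hull need not be a subspace. In the sketch below we take D to consist of the two vectors (1, 0) and (0, 1) in Q², an example of our own; every affine combination of them has entries summing to 1, so the zero vector is never reached. The function name affine_combination is likewise ours.

```python
from fractions import Fraction

def affine_combination(weights, points):
    """Return the sum of a_i * v_i, insisting that the weights a_i sum to 1."""
    if sum(weights) != 1:
        raise ValueError("weights of an affine combination must sum to 1")
    dim = len(points[0])
    return tuple(sum(a * v[i] for a, v in zip(weights, points)) for i in range(dim))

D = [(Fraction(1), Fraction(0)), (Fraction(0), Fraction(1))]

# Weights 3 and -2 sum to 1, so this is an affine combination of D.
point = affine_combination([Fraction(3), Fraction(-2)], D)
assert point == (3, -2)

# The result lies on the line x + y = 1, which does not pass through the origin.
assert point[0] + point[1] == 1
```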


Proposition 3.6 Let V be a vector space over a field F and let D_1 and D_2 
be subsets of V satisfying D_1 ⊆ D_2 ⊆ FD_1. Then FD_1 = FD_2. 


Proof Since FD_1 is a subspace of V containing D_2, we know by Proposition 3.5 
that FD_2 ⊆ FD_1. Conversely, any linear combination of elements of D_1 over F 
is also a linear combination of elements of D_2 over F and so FD_1 ⊆ FD_2, thus 
establishing equality. 


In particular, we note that FD = F (F D) for any subset D of V. 


Proposition 3.7 (Exchange Property) Let V be a vector space over a field 
F and let v, w ∈ V. Let D be a subset of V satisfying v ∈ F(D ∪ {w}) \ FD. 
Then w ∈ F(D ∪ {v}). 


Proof Since v ∈ F(D ∪ {w}) we know that there exist elements v_1, ..., v_n of D 
and scalars a_1, ..., a_n, b in F satisfying the condition that v = ∑_{i=1}^{n} a_i v_i + bw. 
Moreover, since v ∉ FD, we know that b ≠ 0 and so w = b⁻¹v − ∑_{i=1}^{n} b⁻¹a_i v_i ∈ 
F(D ∪ {v}). 
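
The step of the proof that solves for w can be replayed on a concrete instance. The sketch below takes V = Q³ and D = {v1}; the vectors, the scalars a1 and b, and the helper name comb are all illustrative choices of ours.

```python
from fractions import Fraction as Fr

def comb(coeffs, vectors):
    """Linear combination sum(c_i * v_i) of the given vectors."""
    dim = len(vectors[0])
    return tuple(sum(c * v[i] for c, v in zip(coeffs, vectors)) for i in range(dim))

v1 = (Fr(1), Fr(0), Fr(0))      # the single element of D
w = (Fr(0), Fr(1), Fr(2))
a1, b = Fr(3), Fr(4)            # b != 0, so v below does not lie in FD
v = comb([a1, b], [v1, w])      # v = a1*v1 + b*w

# Solve for w exactly as in the proof: w = b^-1 v - b^-1 a1 v1.
w_recovered = comb([1 / b, -a1 / b], [v, v1])
assert w_recovered == w
```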


A vector space V over a field F is finitely generated over F if it has a finite 
generating set. Finitely-generated vector spaces are often much easier to deal with 
by purely algebraic methods and therefore, in several situations, we will have to 
restrict our discussion to these spaces. 


Example If F is a field and n is a positive integer, then one sees that 

    { [1]   [0]         [0] }
    { [0]   [1]         [0] }
    { [⋮] , [⋮] , ... , [⋮] }
    { [0]   [0]         [1] }

is a finite generating set for F^n over F, and so F^n 
is finitely generated over F. More generally, if V is a vector space finitely generated 
over a field F, say V = F{v_1, ..., v_k}, and if n is a positive integer, then 

    { [v_1]         [v_k]   [0_V]         [0_V]         [0_V]         [0_V] }
    { [0_V]         [0_V]   [v_1]         [v_k]         [0_V]         [0_V] }
    { [ ⋮ ] , ... , [ ⋮ ] , [ ⋮ ] , ... , [ ⋮ ] , ... , [ ⋮ ] , ... , [ ⋮ ] }
    { [0_V]         [0_V]   [0_V]         [0_V]         [v_1]         [v_k] }

is a generating set for V^n over F having kn elements. 


Example If F is a field and if k and n are positive integers, then the vector space 
M_{k×n}(F) of all k × n matrices over F is finitely generated over F. Similarly, if V 
is a finitely-generated vector space over F, then the vector space M_{k×n}(V) is also 
finitely generated over F. 


Example For any field F, the vector space F” is not finitely generated over F. 


Example The field R is finitely generated as a vector space over itself, but is not 
finitely generated as a vector space over Q. 


Let $V$ be a vector space over a field $F$. In Proposition 3.4, we saw that if
$\{W_i \mid i \in \Omega\}$ is a collection of subspaces of $V$ then $\bigcap_{i \in \Omega} W_i$ is a subspace of $V$.
In the same way, we can define the subspace $\sum_{i \in \Omega} W_i$ of $V$ to be the set of all
vectors in $V$ of the form $\sum_{j \in \Lambda} w_j$, where $\Lambda$ is a finite nonempty subset of $\Omega$ and
$w_j \in W_j$ for each $j \in \Lambda$. In other words, $\sum_{i \in \Omega} W_i = F\left(\bigcup_{i \in \Omega} W_i\right)$. Indeed, from
the definition of this sum, we see something stronger: if $D_i$ is a generating set for
$W_i$ for each $i \in \Omega$ then $\sum_{i \in \Omega} W_i = F\left(\bigcup_{i \in \Omega} D_i\right)$.

As a special case of the above, we see that if $W_1$ and $W_2$ are subspaces of $V$,
then $W_1 + W_2$ equals the set of all vectors of the form $w_1 + w_2$, where $w_1 \in W_1$
and $w_2 \in W_2$. If both $W_1$ and $W_2$ are finitely generated then $W_1 + W_2$ is also finitely
generated. By induction, we can then show that if $W_1, \ldots, W_n$ are finitely-generated
subspaces of $V$, then $\sum_{i=1}^{n} W_i$ is also finitely generated.


Proposition 3.8 If $V$ is a vector space over a field $F$ and if $\{W_i \mid i \in \Omega\}$ is a
collection of subspaces of $V$, then:

(1) $W_h$ is a subspace of $\sum_{i \in \Omega} W_i$ for all $h \in \Omega$;

(2) If $Y$ is a subspace of $V$ satisfying the condition that $W_h$ is a subspace of
$Y$ for all $h \in \Omega$, then $\sum_{i \in \Omega} W_i$ is a subspace of $Y$.


Proof (1) is clear from the definition. As for (2), if we have a subspace $Y$ satisfying
the given condition, if $\Lambda$ is a finite subset of $\Omega$, and if $w_j \in W_j$ for each $j \in \Lambda$, then
$w_j \in Y$ for each $j$ and so $\sum_{j \in \Lambda} w_j \in Y$. Thus $\sum_{i \in \Omega} W_i \subseteq Y$.


Proposition 3.9 If $V$ is a vector space over a field $F$ and if $W_1$, $W_2$, and $W_3$
are subspaces of $V$, then:

(1) $(W_1 + W_2) + W_3 = W_1 + (W_2 + W_3)$;

(2) $W_1 + W_2 = W_2 + W_1$;

(3) $W_3 \cap [W_2 + (W_1 \cap W_3)] = (W_1 \cap W_3) + (W_2 \cap W_3)$;

(4) (Modular law for subspaces): If $W_1 \subseteq W_3$ then

$$W_3 \cap (W_2 + W_1) = W_1 + (W_2 \cap W_3).$$


Proof Parts (1) and (2) follow immediately from the definition, while part (4) is
a special case of (3). We are therefore left to prove (3). Indeed, if $v$ belongs
to $W_3 \cap [W_2 + (W_1 \cap W_3)]$, then we can write $v = w_2 + y$, where $w_2 \in W_2$ and
$y \in W_1 \cap W_3$. Since $v, y \in W_3$, it follows that $w_2 = v - y \in W_3$, and so
$v = y + w_2 \in (W_1 \cap W_3) + (W_2 \cap W_3)$. Thus we see that
$W_3 \cap [W_2 + (W_1 \cap W_3)] \subseteq (W_1 \cap W_3) + (W_2 \cap W_3)$.
Conversely, assume that $v \in (W_1 \cap W_3) + (W_2 \cap W_3)$. Then,
in particular, $v \in W_3$ and we can write $v = w_1 + w_2$, where $w_1 \in W_1 \cap W_3$ and
$w_2 \in W_2 \cap W_3$. Thus $v = w_1 + w_2 \in W_3 \cap [W_2 + (W_1 \cap W_3)]$. This shows that
$(W_1 \cap W_3) + (W_2 \cap W_3) \subseteq W_3 \cap [W_2 + (W_1 \cap W_3)]$, and so we have the desired equality.
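
Parts (3) and (4) of Proposition 3.9 can be sanity-checked by brute force in a small case. The following is only an illustrative sketch, not part of the proof: it assumes the field is GF(2) and the space is GF(2)^3, represents each subspace as a set of tuples, and uses helper names (`span`, `subspace_sum`, `all_subspaces`, `check_proposition_3_9`) that are invented for this example.

```python
from itertools import combinations, product


def span(generators, n):
    """Return the GF(2)-span of `generators` as a frozenset of n-tuples."""
    result = {(0,) * n}
    for g in generators:
        # Adjoining g replaces the current span S by S together with S + g.
        result |= {tuple((a + b) % 2 for a, b in zip(u, g)) for u in result}
    return frozenset(result)


def subspace_sum(w1, w2):
    """Return W1 + W2 = {u + v : u in W1, v in W2}."""
    return frozenset(
        tuple((a + b) % 2 for a, b in zip(u, v)) for u in w1 for v in w2
    )


def all_subspaces(n):
    """Enumerate every subspace of GF(2)^n (at most n generators are needed)."""
    vectors = list(product((0, 1), repeat=n))
    return {
        span(gens, n)
        for r in range(n + 1)
        for gens in combinations(vectors, r)
    }


def check_proposition_3_9(n=3):
    """Check parts (3) and (4) for all triples of subspaces of GF(2)^n."""
    subspaces = all_subspaces(n)
    for w1, w2, w3 in product(subspaces, repeat=3):
        # Part (3): W3 ∩ [W2 + (W1 ∩ W3)] = (W1 ∩ W3) + (W2 ∩ W3).
        if (w3 & subspace_sum(w2, w1 & w3)) != subspace_sum(w1 & w3, w2 & w3):
            return False
        # Part (4), the modular law, applies only when W1 is contained in W3.
        if w1 <= w3 and (w3 & subspace_sum(w2, w1)) != subspace_sum(w1, w2 & w3):
            return False
    return True


if __name__ == "__main__":
    print(len(all_subspaces(3)), check_proposition_3_9(3))
```

A finite check of this kind only confirms the identities in one small space; the proof above is what establishes them in general.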


Exercises 


Exercise 56 
Is it possible to define on V = Z/(4) the structure of a vector space over GF(2) 
in such a way that the vector addition is the usual addition in Z/(4)? 


Exercise 57 

Consider the set $\mathbb{Z}$ of integers, together with the usual addition. If $a \in \mathbb{Q}$ and
$k \in \mathbb{Z}$, define $a \cdot k$ to be $\lfloor a \rfloor k$, where $\lfloor a \rfloor$ denotes the largest integer less than or
equal to $a$. Using this as our definition of “scalar multiplication”, have we turned
$\mathbb{Z}$ into a vector space over $\mathbb{Q}$?


Exercise 58 

Let $V = \{0, 1\}$ and let $F = GF(2)$. Define vector addition and scalar multiplication
by setting $v + v' = \max\{v, v'\}$, $0v = 0$, and $1v = v$ for all $v, v' \in V$. Does
this define on $V$ the structure of a vector space over $F$?


Exercise 59 
Let $p > 2$ and let $V$ be a vector space over $GF(p)$. Show that $v \neq -v$ for all
$0_V \neq v \in V$.


Exercise 60 
Let $V = C(0, 1)$. Define an operation $\boxplus$ on $V$ by setting
$f \boxplus g : x \mapsto \max\{f(x), g(x)\}$. Does this operation of vector addition, together with the usual
operation of scalar multiplication, define on $V$ the structure of a vector space
over $\mathbb{R}$?


Exercise 61 

Let $V$ be a nontrivial vector space over $\mathbb{R}$. For each $v \in V$ and each complex
number $a + bi$, let us define $(a + bi)v = av$. Does $V$, together with this new
scalar multiplication, form a vector space over $\mathbb{C}$?


Exercise 62 

Let $I$ be the unit interval $[0, 1]$ on the real line and let $V = \mathbb{R} \times I$. Define
operations of addition and scalar multiplication on $V$ as follows: $(a, s) + (b, t) =
(a + b, \min\{s, t\})$ and $c \cdot (a, s) = (ca, s)$. Is $V$ a vector space over $\mathbb{R}$?


Exercise 63 

Let $V = \{i \in \mathbb{Z} \mid 0 \leq i < 2^n\}$ for some given positive integer $n$. Define operations
of vector addition and scalar multiplication on $V$ in such a way as to turn it into
a vector space over the field $GF(2)$.


Exercise 64 

Let $V$ be a vector space over a field $F$. Define a function from $GF(3) \times V$ to
$V$ by setting $(0, v) \mapsto 0_V$, $(1, v) \mapsto v$, and $(2, v) \mapsto -v$ for all $v \in V$. Does this
function, together with the vector addition in $V$, define on $V$ the structure of a
vector space over $GF(3)$?


Exercise 65 
Give an example of a vector space having exactly 125 elements. 


Exercise 66 
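Let $V = \mathbb{Q}^2$, with the usual vector addition. If $a + b\sqrt{2} \in \mathbb{Q}(\sqrt{2})$ and if
$\begin{bmatrix} c \\ d \end{bmatrix} \in \mathbb{Q}^2$, set
$(a + b\sqrt{2}) \begin{bmatrix} c \\ d \end{bmatrix} = \begin{bmatrix} ac + 2bd \\ bc + ad \end{bmatrix}$.
Do these operations turn $\mathbb{Q}^2$ into
a vector space over $\mathbb{Q}(\sqrt{2})$?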


Exercise 67 

Let $V = \mathbb{R} \cup \{\infty\}$ and extend the usual addition of real numbers by defining
$v + \infty = \infty + v = \infty$ for all $v \in V$. Is it possible to define an operation of scalar
multiplication on $V$ in such a manner as to turn it into a vector space over $\mathbb{R}$?


Exercise 68 ‘ ; i 
Let V =R?. || ; H € V andr ER, set | + H = ko ‘| and 


r B = k 7 os | . Do these operations define on V the structure of a vector 


space over R? If so, what is the identity element for vector addition in this space? 
Exercise 69

Let V = R and let o be an operation on R defined by a o b = a°b. Is V, together 
with the usual addition and “scalar multiplication” given by o, a vector space 
over R? 


Exercise 70 
Show that Z is not a vector space over any field. 


Exercise 71 
Let $V$ be a vector space over the field $GF(2)$. Show that $v = -v$ for all $v \in V$.


Exercise 72 
In the definition of a vector space, show that the commutativity of vector addition 
is a consequence of the other conditions. 


Exercise 73 
Let $W$ be the subset of $\mathbb{R}^5$ consisting of all vectors having an odd number of
entries equal to 0. Is $W$ a subspace of $\mathbb{R}^5$?


Exercise 74 
Let $F$ be a field and fix $0 < k \in \mathbb{Z}$. Let $W$ be the subset of $F^{\mathbb{Z}}$ consisting of all
those functions $f$ satisfying

$$f(i + k) = \sum_{j=0}^{k-1} f(i + j)$$

for each $i \in \mathbb{Z}$. Is $W$ a subspace of $F^{\mathbb{Z}}$?


Exercise 75 
Let $W$ be the subset of $\mathbb{R}^3$ consisting of all vectors $\begin{bmatrix} a \\ b \\ c \end{bmatrix}$ satisfying $|a| + |b| = |c|$.
Is $W$ a subspace of $\mathbb{R}^3$?


Exercise 76 

Let $V = \mathbb{R}^{\mathbb{R}}$ and let $W$ be the subset of $V$ containing the constant function $x \mapsto 0$
and all of those functions $f \in V$ satisfying the condition that $f(a) = 0$ for at most
finitely-many real numbers $a$. Is $W$ a subspace of $V$?


Exercise 77 
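Let $V = \left\{ \begin{bmatrix} a_1 \\ \vdots \\ a_5 \end{bmatrix} \;\middle|\; 0 < a_i \in \mathbb{R} \right\}$. If $v = \begin{bmatrix} a_1 \\ \vdots \\ a_5 \end{bmatrix}$ and $w = \begin{bmatrix} b_1 \\ \vdots \\ b_5 \end{bmatrix}$ belong to $V$, and
if $c \in \mathbb{R}$, set $v + w = \begin{bmatrix} a_1 b_1 \\ \vdots \\ a_5 b_5 \end{bmatrix}$ and $cv = \begin{bmatrix} a_1^c \\ \vdots \\ a_5^c \end{bmatrix}$. Do these operations turn $V$ into
a vector space over $\mathbb{R}$?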


Exercise 78 
How many elements are there in the subspace of $GF(3)^3$ generated by
$\left\{ \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}, \begin{bmatrix} 2 \\ 2 \\ 1 \end{bmatrix} \right\}$?

Exercise 79 
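A function $f \in \mathbb{R}^{\mathbb{R}}$ is piecewise constant if and only if it is a constant function
$x \mapsto c$ or there exist $a_1 < a_2 < \cdots < a_n$ and $c_0, \ldots, c_n$ in $\mathbb{R}$ such that

$$f : x \mapsto \begin{cases} c_0 & \text{if } x < a_1, \\ c_i & \text{if } a_i \leq x < a_{i+1} \text{ for } 1 \leq i < n, \\ c_n & \text{if } a_n \leq x. \end{cases}$$

Does the set of all piecewise constant functions form a subspace of the vector
space $\mathbb{R}^{\mathbb{R}}$ over $\mathbb{R}$?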


Exercise 80 

Let V be the vector space of all continuous functions from R to itself and let W 
be the subset of all those functions f € V satisfying the condition that | f (x)| < 1 
for all —1 < x < 1. Is W a subspace of V? 


Exercise 81 
Let $W$ be the subset of $V = GF(2)^5$ consisting of all vectors $\begin{bmatrix} a_1 \\ \vdots \\ a_5 \end{bmatrix}$ satisfying
$\sum_{i=1}^{5} a_i = 0$. Is $W$ a subspace of $V$?


Exercise 82 
Let $V = \mathbb{R}^{\mathbb{R}}$ and let $W$ be the subset of $V$ consisting of all monotonically-increasing
or monotonically-decreasing functions. Is $W$ a subspace of $V$?


Exercise 83 
Let $V = \mathbb{R}^{\mathbb{R}}$ and let $W$ be the subset of $V$ consisting of the constant function
$a \mapsto 0$, and all epic functions. Is $W$ a subspace of $V$?


Exercise 84 

Let $V = \mathbb{R}^{\mathbb{R}}$ and let $W$ be the subset of $V$ containing the constant function $a \mapsto 0$
and all of those functions f € V satisfying the condition that f (x) > f(—zr). Is 
W a subspace of V?
Exercise 85

Let $V = \mathbb{R}^{\mathbb{R}}$ and let $W$ be the subset of $V$ consisting of all functions $f$ satisfying
the condition that there exists a real number $c$ (which depends on $f$) such that
$|f(a)| \leq c|a|$ for all $a \in \mathbb{R}$. Is $W$ a subspace of $V$?


Exercise 86 

Let $V = \mathbb{R}^{\mathbb{R}}$ and let $W$ be the subset of $V$ consisting of all functions $f$ satisfying
the condition that there exist real numbers a and b such that | f (x)| < a| sin(x)| + 
b| cos(x)| for all x > 0. Is W a subspace of V? 


Exercise 87 
Let F be a field and let V = F”, which is a vector space over F. Let W be the 
set of all functions f € V satisfying f (1) = f(—1). Is W a subspace of V? 


Exercise 88 

For any real number $0 < t < 1$, let $V_t$ be the set of all functions $f \in \mathbb{R}^{\mathbb{R}}$ satisfying
the condition that if $a < b$ in $\mathbb{R}$ then there exists a real number $u(a, b)$ satisfying
$|f(x) - f(y)| \leq u(a, b)|x - y|^t$ for all $a < x, y < b$. For which values of $t$ is $V_t$
a subspace of $\mathbb{R}^{\mathbb{R}}$?


Exercise 89 
Let U be a nonempty subset of a vector space V. Show that U is a subspace of 
$V$ if and only if $au + u' \in U$ for all $u, u' \in U$ and $a \in F$.


Exercise 90 

Let V be a vector space over a field F and let v and w be distinct vectors in V. 
Set U = {(1 — t)v + tw | t € F}. Show that there exists a vector y € V such that 
{u + y | u € U} is a subspace of V. 


Exercise 91 
Let V be a vector space over a field F and let W and Y be subspaces of V?. Let 


v Pe ws F 
U be the set of all vectors [e] € V? satisfying the condition that there exists a 


" 


vector v” € V such that H e W and | € Y. Is U a subspace of V7? 


Exercise 92 

Consider $\mathbb{R}$ as a vector space over $\mathbb{Q}$. Given a nonempty subset $W$ of $\mathbb{R}$, let
$\overline{W}$ be the set of all real numbers $b$ for which there exists a sequence $a_1, a_2, \ldots$
of elements of $W$ satisfying $\lim_{i \to \infty} a_i = b$. Show that $\overline{W}$ is a subspace of $\mathbb{R}$
whenever $W$ is.


Exercise 93 

Let V be a vector space over a field F and let P be the collection of all sub- 
sets of V, which we know is a vector space over GF(2). Is the collection of all 
subspaces of V a subspace of P? 
Exercise 94
Let $W$ be the set of all functions $f \in \mathbb{R}^{\mathbb{N}}$ satisfying the condition that if $f(i) \neq 0$
then $f(ji) \neq 0$ for all positive integers $j$. Is $W$ a subspace of $\mathbb{R}^{\mathbb{N}}$?


Exercise 95 
Let $W$ be the set of all functions $f \in \mathbb{R}^{\mathbb{N}}$ satisfying the condition that if $f(i) = 0$
then $f(ji) = 0$ for all positive integers $j$. Is $W$ a subspace of $\mathbb{R}^{\mathbb{N}}$?


Exercise 96 


Let $V$ be a vector space over a field $F$ and let $Y$ be the set of all matrices of the
form $\begin{bmatrix} v_1 & v_2 & 0_V \\ 0_V & v_1 + v_2 & 0_V \\ 0_V & v_1 & v_2 \end{bmatrix}$ in $M_{3 \times 3}(V)$. Is $Y$ a subspace of $M_{3 \times 3}(V)$?

Exercise 97 


Let $W$ be the set of all functions $f \in \mathbb{R}^{\mathbb{R}}$ satisfying the following condition:
there exist positive real numbers $a$ and $b$ such that for all $x \in \mathbb{R}$ satisfying $|x| > a$
we have $|f(x)| < b|x|$. Show that $W$ is a subspace of $\mathbb{R}^{\mathbb{R}}$.


Exercise 98 
Let $W$ be a subspace of a vector space $V$ over a field $F$. Is the set $(V \setminus W) \cup \{0_V\}$
necessarily a subspace of $V$?


Exercise 99 

Let $V$ be a vector space over a field $F$ and let $f$ be a function from $V$ to the
unit interval $[0, 1]$ on the real line satisfying the condition that $f(au + bv) \geq
\min\{f(u), f(v)\}$ for all $a, b \in F$ and all $u, v \in V$. Show that $f(0_V) \geq f(v)$ for
all $v \in V$ and that if $0 < h < f(0_V)$ then $V_h = \{v \in V \mid f(v) \geq h\}$ is a subspace
of $V$.


Exercise 100 
Consider the elements f, g, h of Q defined by f : t> t — 1, g : t> t + 1, and 
h: t> t? + 1. Does the function t + t? belong to Q{ f, g, h}? 


Exercise 101 


1 1 
Let F = GF(3) and let D = 1,10 . For which scalars c is | 1 | a 
0 2 c 


linear combination of elements of D? 
Exercise 102

Find a real number $c$ such that $\begin{bmatrix} 4 \\ 3 \\ 1 \end{bmatrix} \in \mathbb{R}^3$ is a linear combination of
$\begin{bmatrix} 3 \\ 1 \\ c \end{bmatrix}$ and $\begin{bmatrix} -1 \\ 2 \\ 1 \end{bmatrix}$.


Exercise 103 
Find subsets D and D’ of R? such that R(D N D) RD ARD. 


Exercise 104 
Find subspaces $W$ and $Y$ of $\mathbb{R}^3$ having the property that $W \cup Y$ is not a subspace
of $\mathbb{R}^3$.


Exercise 105 
Let $V$ be a vector space over a field $F$ and let $0_V \neq w \in V$. Given a vector
$v \in V \setminus Fw$, find the set $G$ of all scalars $a \in F$ satisfying $F\{v, w\} = F\{v, aw\}$.


Exercise 106 
Let p be a prime integer and let V be a vector space over F = GF(p). Show that 
V is not the union of k subspaces, for any k < p. 


Exercise 107 

Let V be a vector space over a field F and let c and d be fixed elements of F. 
Define a new operation $\boxplus$ on $V$ by setting $v \boxplus v' = cv + dv'$. Is $V$, with this new
vector addition and the old scalar multiplication, still a vector space over F? 


Exercise 108 

Let $I$ be the closed unit interval $[0, 1]$ on the real line. A function $(a, b) \mapsto a \circ b$
from $I \times I$ to $I$ is a triangular norm¹ if and only if the following conditions hold
for all $a, b, c \in I$:

(1) $a \circ 1 = a$;
(2) $a \leq c$ implies that $a \circ b \leq c \circ b$;
(3) $a \circ b = b \circ a$;
(4) $a \circ (b \circ c) = (a \circ b) \circ c$.

Given a vector space $V$ over a field $F$, and given a triangular norm $\circ$ on $I$,
a function $f : V \to I$ is a $\circ$-fuzzy subspace of $V$ if and only if, for each $v, w \in V$
and each $d \in F$, we have $f(v + w) \geq f(v) \circ f(w)$ and $f(dv) \geq f(v)$. Find
a condition that a $\circ$-fuzzy subspace $f$ of $V$ must satisfy for the set $\{v \in V \mid
f(v) = a\}$ to be a subspace of $V$ for any $a \in I$.


¹Triangular norms play a very important part in the theory of probabilistic metric spaces and have
important applications in statistics and in mathematical economics, as well as such areas as pattern
recognition and capacity theory.




Exercise 109 

Let $V$ be a vector space over a field $F$ and let $D$ be a nonempty subset of $V$.
A zero-sum combination of elements of $D$ is an element of the form $\sum_{i=1}^{n} a_i v_i$,
where the $v_i$ are elements of $D$ and the $a_i$ are scalars satisfying $\sum_{i=1}^{n} a_i = 0$. The
set $z(D)$ of all zero-sum combinations of elements of $D$ is called the zero-sum
hull of $D$. Is it true that $z(z(D)) = z(D)$? Is $z(D)$ necessarily a subspace of $V$?


Exercise 110 

Let $V$ be a vector space over a field $F$ and let $D$ be a nonempty subset of $V$.
A uniform combination of elements of $D$ is an element of the form $\sum_{i=1}^{n} a_i v_i$,
where the $v_i$ are elements of $D$ and $a_1 = \cdots = a_n$. The set $u(D)$ of all uniform
combinations of elements of $D$ is called the uniform hull of $D$. Is it true that
$u(u(D)) = u(D)$? Is $u(D)$ necessarily a subspace of $V$?


Exercise 111 
If we identify $\mathbb{R}^2$ with the Euclidean plane in the usual way, and if $v \neq w$ are two
vectors in $\mathbb{R}^2$, show that $\mathrm{affh}(\{v, w\})$ is the line passing through these two points.


Exercise 112 

If we identify $\mathbb{R}^3$ with the three-dimensional Euclidean space in the usual way
and if $v, w, y$ are distinct vectors in $\mathbb{R}^3$ which are not collinear, show that
$\mathrm{affh}(\{v, w, y\})$ is the plane determined by the three points.


Exercise 113 
Let $V$ be a vector space over a field $F$ and let $D$ be a subset of $V$ containing $0_V$.
Show that affh(D) is a subspace of V. 


Algebras Over a Field 


In general, a vector space does not carry with it the notion of multiplying two vectors
in the space to produce a third vector. However, sometimes such multiplication may
be possible. A vector space $K$ over a field $F$ is an $F$-algebra if and only if there
exists a function $(v, w) \mapsto v \bullet w$ from $K \times K$ to $K$ such that

(1) $u \bullet (v + w) = u \bullet v + u \bullet w$;

(2) $(u + v) \bullet w = u \bullet w + v \bullet w$;

(3) $a(v \bullet w) = (av) \bullet w = v \bullet (aw)$

for all $u, v, w \in K$ and $a \in F$. As in the proof of Proposition 2.3(1), these conditions
suffice to show that $0_K \bullet v = v \bullet 0_K = 0_K$ for all $v \in K$.

Note that the operation $\bullet$ need not be associative, nor need there exist an identity
element for this operation. When the operation is associative, i.e. when it satisfies
(4) $v \bullet (w \bullet y) = (v \bullet w) \bullet y$
for all $v, w, y \in K$, then the algebra is called an associative $F$-algebra. If an identity
element for $\bullet$ exists, that is to say, if there exists an element $0_K \neq e \in K$ satisfying
$v \bullet e = v = e \bullet v$ for all $v \in K$, we say the $F$-algebra $K$ is unital. In a unital
$F$-algebra, as with the case of fields, the identity element must be unique. In this
case, we can then identify $F$ with the subset $\{ae \mid a \in F\}$ of $K$ and we note that
$a \bullet v = v \bullet a$ for all $v \in K$ and $a \in F$.

If $v$ is an element of an associative $F$-algebra $(K, \bullet)$ and if $n$ is a positive integer,
we write $v^n$ instead of $v \bullet \cdots \bullet v$ ($n$ factors). If $K$ is also unital and has a
multiplicative identity $e$, we set $v^0 = e$ for all $0_K \neq v \in K$. The element $(0_K)^0$ is
not defined.

If $v \bullet w = w \bullet v$ for all $v$ and $w$ in some $F$-algebra $K$, then the algebra is commutative.
An $F$-algebra $(K, \bullet)$ satisfying the condition that $v \bullet w = -w \bullet v$ for all
$v, w \in K$ is anticommutative. If the characteristic of $F$ is other than 2, it is easy to
see that this condition is equivalent to the condition that $v \bullet v = 0_K$ for all $v \in K$.
Of course, in that case $K$ cannot possibly be unital.


J.S. Golan, The Linear Algebra a Beginning Graduate Student Ought to Know, 39 
DOI 10.1007/978-94-007-2636-9_4, © Springer Science+Business Media B.V. 2012 
With kind permission of the Harvard University Archives, HUP (B. Peirce and C.S. Peirce); With kind
permission of the American Mathematical Society (Dickson); With kind permission of the Bryn Mawr College
Library, Special Collections (Noether).


The first systematic study of associative algebras was initiated by the nineteenth-century 
American mathematician Benjamin Peirce and continued by his son, the mathematician 
and logician Charles Sanders Peirce. Other major contributors at the beginning of the 
twentieth century were the American mathematician Leonard Dickson, the Scottish math- 
ematician Joseph Henry Wedderburn, and the German mathematician Emmy Noether, 
generally known as “the mother of modern algebra”. 


If $(K, \bullet)$ is an associative unital $F$-algebra having a multiplicative identity $e$,
and if $v \in K$ satisfies the condition that there exists an element $w \in K$ such that
$v \bullet w = w \bullet v = e$, then we say that $v$ is a unit of $K$. As with the case of fields, such
an element $w$, if it exists, is unique and is usually denoted by $v^{-1}$. If $v$ is a unit, then
so is $-v$, for one immediately notes that $(-v)^{-1} = -(v^{-1})$. Also, it is easy to see
that if $v$ and $w$ are units of $K$, then so is $v \bullet w$. Indeed,

$$(v \bullet w) \bullet (w^{-1} \bullet v^{-1}) = (v \bullet (w \bullet w^{-1})) \bullet v^{-1} = (v \bullet e) \bullet v^{-1} = v \bullet v^{-1} = e$$

and similarly $(w^{-1} \bullet v^{-1}) \bullet (v \bullet w) = e$, so $(v \bullet w)^{-1} = w^{-1} \bullet v^{-1}$. If $v \in K$ is a
unit and if $n > 1$ is an integer, we write $v^{-n}$ instead of $(v^{-1})^n$. Note that Hua's
identity (Proposition 2.6) in fact holds in any associative unital $F$-algebra in which
the needed inverses exist, since the proof relies only on associativity of addition and
multiplication and distributivity of multiplication over addition.

If $(K, \bullet)$ is an $F$-algebra, and $v, w \in K$, then $(v, w)$ forms a commuting pair
if and only if $v \bullet w = w \bullet v$. Of course, if the algebra $K$ is commutative, all pairs
of elements commute, but in general that will not be the case. Note that if $(v, w)$
is a commuting pair in a unital associative $F$-algebra $(K, \bullet)$ and $v^{-1}$ exists, then
$(v^{-1}, w)$ is also a commuting pair. Indeed, $(v^{-1} \bullet w) \bullet v = v^{-1} \bullet (w \bullet v) = v^{-1} \bullet
(v \bullet w) = w$ so $w \bullet v^{-1} = [(v^{-1} \bullet w) \bullet v] \bullet v^{-1} = v^{-1} \bullet w$.

Example Any vector space $V$ over a field $F$ can be turned into an associative and
commutative $F$-algebra which is not unital by setting $v \bullet w = 0_V$ for all $v, w \in V$.


Example If $F$ is a subfield of a field $K$, then $K$ has the structure of an associative
$F$-algebra, with multiplication being the multiplication in $K$. Thus, $\mathbb{C}$ is an
$\mathbb{R}$-algebra and $\mathbb{Q}(\sqrt{p})$ is a $\mathbb{Q}$-algebra for every prime integer $p$. These algebras are,
of course, unital.


Example Let $F$ be a field, let $(K, \bullet)$ be an $F$-algebra, and let $\Omega$ be a nonempty
set. Then the vector space $K^{\Omega}$ of all functions from $\Omega$ to $K$ has the structure of
an $F$-algebra with respect to the operation $\bullet$ defined by $f \bullet g : i \mapsto f(i) \bullet g(i)$ for
all $i \in \Omega$. This $F$-algebra is associative if $K$ is. If $K$ is unital with multiplicative
identity element $e$, then $K^{\Omega}$ is also unital, with identity element given by the constant
function $i \mapsto e$. In particular, if $F$ is a field and if $\Omega$ is a nonempty set then
$F^{\Omega}$ is an associative unital $F$-algebra with respect to the operation $\bullet$ defined by
$f \bullet g : i \mapsto f(i)g(i)$ for all $i \in \Omega$.


Example We have seen that the collection of all subsets of a given nonempty set
$A$ is a vector space over $GF(2)$. It is in fact an associative and commutative unital
$GF(2)$-algebra with respect to the operation $\cap$. The identity element with respect to
this operation is $A$ itself.


Example Define an operation $*$ on $C(0, \infty)$ by setting

$$f * g : t \mapsto \int_{0}^{t} f(t - u)g(u)\,du.$$

This turns $C(0, \infty)$ into an associative and commutative $\mathbb{R}$-algebra, known as the
convolution algebra on $\mathbb{R}$.


Example Let $K$ be the vector space over $\mathbb{R}$ consisting of all functions in $\mathbb{R}^{\mathbb{R}}$ which
are infinitely differentiable, and define an operation $\bullet$ on $K$ by setting $f \bullet g = (fg)'$
(where $'$ denotes differentiation). Then $(K, \bullet)$ is an algebra which is commutative
but not associative.


The collection of all operations e on a vector space V over a field F which turn 
V into an F-algebra will be studied in more detail in Chap. 20. 

Let $F$ be a field. If $(K, \bullet)$ is an $F$-algebra, then a subspace $L$ of $K$ satisfying
the condition that $w \bullet w' \in L$ for all $w, w' \in L$ is an $F$-subalgebra of $K$. If $(K, \bullet)$
is a unital $F$-algebra, then $L$ is a unital subalgebra if it contains the multiplicative
identity element of $K$.


Let $F$ be a field. An anticommutative $F$-algebra $(K, \bullet)$ is a Lie algebra over $F$
if and only if it satisfies the additional condition

$$\text{(Jacobi identity)} \quad u \bullet (v \bullet w) + v \bullet (w \bullet u) + w \bullet (u \bullet v) = 0_K$$

for all $u, v, w \in K$. This algebra is not associative unless $u \bullet v = 0_K$ for all $u, v \in K$.




With kind permission of the Archives of the Mathematisches 
Forschungsinstitut Oberwolfach. 

Sophus Lie was a nineteenth-century Norwegian mathematician who developed mathematical
concepts that provide the basic model for quantum theory and an important tool in differential
geometry. They were independently defined by the nineteenth-century German teacher Wilhelm
Karl Joseph Killing, in connection with his work on non-Euclidean geometry. Another pioneer
in the study of noncommutative algebras because of their importance in physics was the
twentieth-century British mathematician Dudley Littlewood.


Example Let $F$ be a field and let $(K, *)$ be an associative $F$-algebra. Define a new
operation $\bullet$ on $K$ by setting $v \bullet w = v * w - w * v$. Then $(K, \bullet)$ is a Lie algebra
over $F$, which is usually denoted by $K^{-}$. The operation in $K^{-}$ is known as the Lie
product defined on the given $F$-algebra $K$. This example is very important because
one can show that any Lie algebra over a field $F$ can be considered as a subalgebra
of a Lie algebra of the form $K^{-}$ for some associative $F$-algebra $K$. (A proof of this
result, known as the Poincaré-Birkhoff-Witt Theorem, is far beyond the scope of
this book.) If $v, w \in K$, then $v \bullet w = 0_K$ precisely when $v * w = w * v$, in other
words, precisely when $(v, w)$ forms a commuting pair in $(K, *)$.
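
The Lie product can be tried out concretely. The sketch below is an illustration only: it assumes that the associative algebra $K$ is the algebra of square integer matrices under ordinary matrix multiplication, stored as nested lists, and the helper names (`mat_mul`, `lie_product`, `jacobi_sum`) are invented for this example.

```python
def mat_mul(a, b):
    """Ordinary (associative) product of two square matrices given as nested lists."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def mat_add(a, b):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def mat_sub(a, b):
    """Entrywise difference of two matrices of the same shape."""
    return [[x - y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def lie_product(v, w):
    """The Lie product v * w - w * v built from the associative product."""
    return mat_sub(mat_mul(v, w), mat_mul(w, v))


def jacobi_sum(u, v, w):
    """Left-hand side of the Jacobi identity; it should be the zero matrix."""
    first = lie_product(u, lie_product(v, w))
    second = lie_product(v, lie_product(w, u))
    third = lie_product(w, lie_product(u, v))
    return mat_add(mat_add(first, second), third)


if __name__ == "__main__":
    u = [[1, 2], [3, 4]]
    v = [[0, 1], [1, 0]]
    w = [[2, 0], [1, 3]]
    print(lie_product(u, v))    # nonzero, so (u, v) is not a commuting pair
    print(jacobi_sum(u, v, w))  # the zero matrix
```

A zero Lie product detects a commuting pair, as noted at the end of the example; for instance, every matrix commutes with itself.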


Lie algebras are of fundamental importance in modeling problems in physics,
and have many other applications; they are in the forefront of current mathematical
research. One particular Lie algebra defined on $\mathbb{R}^3$ goes back to the work of Grassmann.
Define the structure of an $\mathbb{R}$-algebra on $\mathbb{R}^3$ with multiplication $\times$ given by

$$\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} \times \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{bmatrix}.$$

This operation, called the cross product, has very important applications in physics
and engineering. It is easy to check that the algebra $(\mathbb{R}^3, \times)$ is a Lie algebra over $\mathbb{R}$.
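
The claim that the cross product yields a Lie algebra can be spot-checked numerically. The following sketch is not a proof: it evaluates anticommutativity and the Jacobi identity on a few integer vectors only, and the function names `cross` and `jacobi_sum` are chosen for this illustration.

```python
def cross(a, b):
    """Cross product of two vectors of R^3 given as 3-tuples."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)


def jacobi_sum(u, v, w):
    """u x (v x w) + v x (w x u) + w x (u x v); this should be the zero vector."""
    terms = (cross(u, cross(v, w)), cross(v, cross(w, u)), cross(w, cross(u, v)))
    return tuple(sum(component) for component in zip(*terms))


if __name__ == "__main__":
    u, v, w = (1, 2, 3), (-4, 0, 5), (2, -1, 7)
    print(cross(u, v), cross(v, u))  # negatives of each other
    print(jacobi_sum(u, v, w))       # (0, 0, 0)
    # The product is not associative: the two groupings below differ.
    e1, e2 = (1, 0, 0), (0, 1, 0)
    print(cross(cross(e1, e1), e2), cross(e1, cross(e1, e2)))
```

The last line compares $(e_1 \times e_1) \times e_2$, which is the zero vector, with $e_1 \times (e_1 \times e_2)$, which is not.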


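Note that if $v_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$, $v_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$, and $v_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$, then, surely,
$v_1 \times v_2 = v_3$, $v_2 \times v_3 = v_1$, and $v_3 \times v_1 = v_2$. Moreover, the cross product is the only
possible anticommutative product which can be defined on $\mathbb{R}^3$ and which satisfies this
condition. Indeed, if $\bullet$ is any such product defined on $\mathbb{R}^3$ then

$$\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} \bullet \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \left( \sum_{i=1}^{3} a_i v_i \right) \bullet \left( \sum_{j=1}^{3} b_j v_j \right) = \sum_{i=1}^{3} \sum_{j=1}^{3} a_i b_j (v_i \bullet v_j) = \begin{bmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{bmatrix} = \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} \times \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}.$$
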
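Proposition 4.1 If $v$ and $w$ are nonzero elements of $\mathbb{R}^3$, then $v \times w = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$
if and only if $\mathbb{R}v = \mathbb{R}w$.

Proof Suppose $v = \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}$ and $w = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}$ satisfy $v \times w = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$. These vectors are nonzero and so one
of the entries in $w$ is nonzero; without loss of generality, we can assume that $b_1 \neq 0$.
Then $a_2 b_3 - a_3 b_2 = a_3 b_1 - a_1 b_3 = a_1 b_2 - a_2 b_1 = 0$ and so, if we define $c = a_1 b_1^{-1}$,
we have $v = cw$. Hence $v \in \mathbb{R}w$. Moreover, $c \neq 0$ so $w = c^{-1}v \in \mathbb{R}v$, proving the
desired equality. Conversely, if $\mathbb{R}v = \mathbb{R}w$ then there exists an $0 \neq d \in \mathbb{R}$ such that
$w = dv$. Then $v \times w = d(v \times v) = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$.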


The cross product is very particular to the vector space $\mathbb{R}^3$, and does not generalize
easily to spaces of the form $\mathbb{R}^n$ for $n > 3$, with the exception of $n = 7$, which
we will see in a later chapter.


An important non-associative algebra is the following: let $F$ be a field of characteristic
other than 2, and let $(K, *)$ be an associative algebra. We can define a new
operation $\bullet$ on $K$, called the Jordan product, by setting $v \bullet w = \frac{1}{2}(v * w + w * v)$.
Then $(K, \bullet)$ is a commutative $F$-algebra, usually denoted by $K^{+}$, called the Jordan
algebra defined by $K$. It is not associative in general, but does satisfy

$$\text{(Jordan identity)} \quad (v \bullet w) \bullet (v \bullet v) = v \bullet (w \bullet (v \bullet v))$$

for all $v, w \in K$. Jordan algebras have important applications in physics. Note that if
$v * w = w * v$, then $v \bullet w = v * w$. This observation will have important consequences
later. In particular, if $(K, *)$ is unital with multiplicative identity $e$, then $e \bullet v =
v \bullet e = v$ for all $v \in K$, so $K^{+}$ is also unital.
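
The Jordan product can be explored in the same way. This sketch assumes, purely for illustration, that $K$ is the algebra of $2 \times 2$ matrices over $\mathbb{Q}$ with entries stored as `Fraction` values; the helper names are invented here. It checks commutativity and the Jordan identity on sample matrices and exhibits one triple on which associativity fails.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Ordinary (associative) product of two square matrices given as nested lists."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def jordan(v, w):
    """Jordan product (v * w + w * v) / 2, computed exactly with fractions."""
    vw, wv = mat_mul(v, w), mat_mul(w, v)
    return [
        [Fraction(x + y, 2) for x, y in zip(row_vw, row_wv)]
        for row_vw, row_wv in zip(vw, wv)
    ]


def jordan_identity_holds(v, w):
    """Check (v.w).(v.v) == v.(w.(v.v)) for the Jordan product."""
    vv = jordan(v, v)
    return jordan(jordan(v, w), vv) == jordan(v, jordan(w, vv))


if __name__ == "__main__":
    e11 = [[1, 0], [0, 0]]
    e12 = [[0, 1], [0, 0]]
    print(jordan(e11, e12) == jordan(e12, e11))  # commutative
    print(jordan_identity_holds(e11, e12))       # Jordan identity
    # Associativity fails: (e11 . e11) . e12 differs from e11 . (e11 . e12).
    print(jordan(jordan(e11, e11), e12), jordan(e11, jordan(e11, e12)))
```

Dividing by 2 is where the assumption that the characteristic of $F$ is other than 2 enters.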
© the estate of Friedrich Hund. Reproduced with kind permission of Gerhard
Hund (Jordan); With kind permission of the Archives of the Mathematisches
Forschungsinstitut Oberwolfach (Jacobson); With kind permission of the Amer-

Jordan algebras were developed by the twentieth-century German
physicist Pascual Jordan, one of the fathers of quantum mechanics and quantum electrodynamics.
The algebraic structure of Lie algebras and Jordan algebras was studied in detail by
the twentieth-century American mathematicians Nathan Jacobson and A. Adrian Albert.


We now come to an extremely important algebra. Let $F$ be a field and let $X$
be an element not in $F$, which we will call an indeterminate. A polynomial in $X$
with coefficients in $F$ is a formal sum $f(X) = \sum_{i=0}^{\infty} a_i X^i$, in which the elements $a_i$
belong to $F$, and no more than a finite number of these elements differ from 0. The
elements $a_i$ are called the coefficients of the polynomial. If all of the $a_i$ equal 0, then
the polynomial is called the 0-polynomial. Otherwise, there exists a nonnegative
integer $n$ satisfying the condition that $a_n \neq 0$ and $a_i = 0$ for all $i > n$. The coefficient
$a_n$ is called the leading coefficient of the polynomial; the integer $n$ is called the
degree of the polynomial, and is denoted by $\deg(f)$. If the leading coefficient of a
polynomial is 1, the polynomial is monic. The degree of the 0-polynomial is defined
to be $-\infty$, where we assume that $-\infty < i$ for each integer $i$ and $(-\infty) + i = -\infty$
for all integers $i$. If $f(X)$ is a polynomial of degree $n \neq -\infty$, we often write it
as $\sum_{i=0}^{n} a_i X^i$. The set of all polynomials in $X$ with coefficients in $F$ is denoted
by $F[X]$. We identify the elements of $F$ with the polynomials of degree at most 0,
and so can consider $F$ as a subdomain of $F[X]$. We can associate the 0-polynomial
with the identity element 0 of $F$ for addition and the polynomials of degree 0 with
the nonzero elements of $F$ and so, without any problems, consider $F$ as a subset
of $F[X]$.


Example The polynomials $5X^3 + 2X^2 + 1$ and $5X^3 - X^2 + X + 4$ in $\mathbb{Q}[X]$ both have
degree 3 and leading coefficient 5. Therefore, they are not monic. The polynomials
$X^3 + 2X^2 + 1$ and $X^3 - X^2 + X + 4$ in $\mathbb{Q}[X]$ are both monic and have the same
degree 3.


We define addition and multiplication of polynomials over a field as follows:
if $f(X) = \sum_{i=0}^{\infty} a_i X^i$ and $g(X) = \sum_{i=0}^{\infty} b_i X^i$ are polynomials in $F[X]$, then
$f(X) + g(X)$ is the polynomial $\sum_{i=0}^{\infty} c_i X^i$, where $c_i = a_i + b_i$ for all $i \geq 0$ and
$f(X)g(X)$ is the polynomial $\sum_{i=0}^{\infty} d_i X^i$, where $d_i = \sum_{j=0}^{i} a_j b_{i-j}$ for all $i \geq 0$. It
is easy to verify that these definitions turn $F[X]$ into an associative and commutative
unital $F$-algebra with the 0-polynomial acting as the identity element for addition
and the degree-0 polynomial 1 acting as the identity element for multiplication. This
algebra is an integral domain, that is, the product of two nonzero elements of $F[X]$
is again nonzero. In general, an algebra having this property is said to be entire.
Thus, commutative, associative, entire, unital $F$-algebras are integral domains. The
converse of this is not true: $\mathbb{Z}$ is an integral domain which is not an $F$-algebra for
any field $F$. Not every commutative and associative unital $\mathbb{R}$-algebra is entire. Indeed,
the functions $f : a \mapsto \max\{a, 0\}$ and $g : a \mapsto \max\{-a, 0\}$ are both nonzero
elements of the $\mathbb{R}$-algebra $\mathbb{R}^{[-1,1]}$, but their product is the 0-function.
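
The definitions of polynomial addition and multiplication translate directly into code. The sketch below rests on a representation chosen only for this illustration: a polynomial is a list of coefficients `[a_0, a_1, ..., a_n]` with no trailing zeros, the 0-polynomial is the empty list, and its degree is reported as `float("-inf")` to mirror the convention $\deg(0) = -\infty$. The function names are not taken from any library.

```python
def trim(coeffs):
    """Drop trailing zero coefficients so the last entry is the leading coefficient."""
    coeffs = list(coeffs)
    while coeffs and coeffs[-1] == 0:
        coeffs.pop()
    return coeffs


def degree(f):
    """Degree of a trimmed polynomial, with -infinity for the 0-polynomial."""
    return len(f) - 1 if f else float("-inf")


def poly_add(f, g):
    """Coefficientwise sum: c_i = a_i + b_i."""
    n = max(len(f), len(g))
    padded_f = list(f) + [0] * (n - len(f))
    padded_g = list(g) + [0] * (n - len(g))
    return trim(a + b for a, b in zip(padded_f, padded_g))


def poly_mul(f, g):
    """Product by the definition: d_i is the sum of a_j * b_(i-j) over 0 <= j <= i."""
    if not f or not g:
        return []
    d = [0] * (len(f) + len(g) - 1)
    for j, a in enumerate(f):
        for k, b in enumerate(g):
            d[j + k] += a * b
    return trim(d)


if __name__ == "__main__":
    f = [1, 0, 2, 5]    # 5X^3 + 2X^2 + 1
    g = [4, 1, -1, 5]   # 5X^3 - X^2 + X + 4
    print(poly_add(f, g), degree(poly_add(f, g)))
    print(poly_mul(f, g), degree(poly_mul(f, g)))
```

With this representation the rule $\deg(fg) = \deg(f) + \deg(g)$ holds even when one factor is the 0-polynomial, since adding an integer to `float("-inf")` again gives `float("-inf")`.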

If $f(X) = \sum_{i=0}^{\infty} a_i X^i$ and $g(X) = \sum_{i=0}^{\infty} b_i X^i$ are polynomials in $F[X]$ then we
define the polynomial $f(g(X))$ to be $\sum_{i=0}^{\infty} a_i g(X)^i$. Then, for any fixed $g(X)$, the
set $F[g(X)] = \{f(g(X)) \mid f(X) \in F[X]\}$ is a unital subalgebra of $F[X]$.

Note that every polynomial in $F[X]$ is a linear combination of elements of the
set $B = \{1, X, X^2, \ldots\}$ over $F$, so $B$ is a set of generators of $F[X]$ over $F$. On the
other hand, it is clear that no finite set of polynomials can be a generating set for
$F[X]$ over $F$, and so $F[X]$ is not finitely generated as a vector space over $F$.

We should remark that the formal definition of multiplication of polynomials
does not translate into the fastest method of carrying out such multiplication in practice
on a computer, especially for polynomials of large degree. The problem of fast
polynomial multiplication has been the subject of extensive research over the years,
and many interesting algorithms to perform such multiplication have been devised.
A typical such algorithm is Karatsuba's algorithm, which is easy to implement on
a computer: let $f(X)$ and $g(X)$ be polynomials in $F[X]$, where $F$ is a field. We can
write these polynomials as $f(X) = \sum_{i=0}^{n} a_i X^i$ and $g(X) = \sum_{i=0}^{n} b_i X^i$, where $n$ is
a nonnegative power of 2 satisfying $n \geq \max\{\deg(f), \deg(g)\}$. (Of course, in this
case $a_n$ and $b_n$ may equal 0.) We now calculate $f(X)g(X)$ as follows:


(1) If $n = 1$ then $f(X)g(X) = a_1 b_1 X^2 + (a_0 b_1 + a_1 b_0)X + a_0 b_0$.
(2) Otherwise, write $f(X) = f_1(X)X^{n/2} + f_0(X)$ and

$$g(X) = g_1(X)X^{n/2} + g_0(X),$$

where the polynomials $f_0(X)$, $f_1(X)$, $g_0(X)$, and $g_1(X)$ are all of degree at
most $n/2$.
(3) Recursively, calculate $f_0(X)g_0(X)$, $f_1(X)g_1(X)$, and

$$(f_0 + f_1)(X)(g_0 + g_1)(X).$$

(4) Then

$$f(X)g(X) = X^n (f_1 g_1)(X) + X^{n/2}\left[(f_0 + f_1)(g_0 + g_1) - f_0 g_0 - f_1 g_1\right](X) + (f_0 g_0)(X).$$

Indeed, if the multiplication of two polynomials of degree at most $n$ using the definition
of polynomial multiplication takes an order of $2n^2$ arithmetic operations (i.e.,
additions and multiplications), it is possible to prove that there exists a fixed positive
integer $c$ such that the multiplication of two polynomials of degree at most $n$ using
Karatsuba's algorithm takes at most $cn^{1.59}$ arithmetic operations. If $n$ is sufficiently
large, the difference between these two bounds can be significant.
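
The recursion in steps (1) to (4) can be sketched in code. This is a variant chosen for simplicity, not a transcription of the steps above: polynomials are coefficient lists `[a_0, a_1, ...]`, the split point is half the longer list rather than half a power of 2, and lists of length at most `cutoff` are multiplied directly from the definition. All function names are invented for this illustration.

```python
def poly_add(f, g):
    """Coefficientwise sum of two coefficient lists (no trimming)."""
    n = max(len(f), len(g))
    return [(f[i] if i < len(f) else 0) + (g[i] if i < len(g) else 0) for i in range(n)]


def poly_sub(f, g):
    """Coefficientwise difference of two coefficient lists (no trimming)."""
    n = max(len(f), len(g))
    return [(f[i] if i < len(f) else 0) - (g[i] if i < len(g) else 0) for i in range(n)]


def schoolbook(f, g):
    """Product computed directly from the definition of polynomial multiplication."""
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


def karatsuba(f, g, cutoff=2):
    """Product using three recursive multiplications of roughly half the size."""
    if not f or not g:
        return []
    if len(f) <= cutoff or len(g) <= cutoff:
        return schoolbook(f, g)
    m = max(len(f), len(g)) // 2
    f0, f1 = f[:m], f[m:]          # f = f1 * X^m + f0
    g0, g1 = g[:m], g[m:]          # g = g1 * X^m + g0
    low = karatsuba(f0, g0, cutoff)
    high = karatsuba(f1, g1, cutoff)
    cross_terms = karatsuba(poly_add(f0, f1), poly_add(g0, g1), cutoff)
    middle = poly_sub(poly_sub(cross_terms, low), high)
    out = [0] * (len(f) + len(g) - 1)
    for i, c in enumerate(low):
        out[i] += c
    for i, c in enumerate(middle):
        out[i + m] += c
    for i, c in enumerate(high):
        out[i + 2 * m] += c
    return out


if __name__ == "__main__":
    f = [1, 0, 2, 5, -3, 7]
    g = [4, 1, -1, 5, 2]
    print(karatsuba(f, g) == schoolbook(f, g))
```

The saving comes from step (3): each level performs three half-size products instead of four, at the cost of a few extra additions and subtractions.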
The main idea of Karatsuba’s algorithm lies in the recursive reduction of the
degrees of the polynomials involved. The method of recursive reduction has since been
extended to fast algorithms in many other areas of mathematics. We will encounter
it again when we consider the Strassen-Winograd algorithms for matrix
multiplication.




Anatoli Alexeevich Karatsuba is a contemporary Russian mathe- 
matician whose research is primarily in number theory. 


There are other highly sophisticated algorithms for multiplying two polynomials
of degree at most n using on the order of $n \log(n)$ arithmetic operations.


Proposition 4.2 (Division Algorithm) If F is a field and if f(X) and $g(X) \neq 0$
are elements of F[X], then there exist unique polynomials u(X) and v(X)
in F[X] satisfying f(X) = g(X)u(X) + v(X) and deg(v) < deg(g).


Proof Assume that $f(X) = \sum_{i=0}^{n} a_i X^i$ and $g(X) = \sum_{i=0}^{k} b_i X^i$ are the given
polynomials. If f(X) = 0 or if deg(f) < deg(g), choose u(X) = 0 and v(X) = f(X),
and we are done. Thus we can assume that $n = \deg(f) \geq \deg(g) = k$, and will
prove our result by induction on n. If n = 0 then k = 0, and therefore we can
choose u(X) to be $a_0 b_0^{-1}$, which is a polynomial of degree 0, and choose v(X)
to be the 0-polynomial. Now assume, inductively, that n > 0 and that the proposition
has been established for all polynomials f(X) of degree less than n. Set
$h(X) = f(X) - a_n b_k^{-1} X^{n-k} g(X)$. If this is the 0-polynomial, choose
$u(X) = a_n b_k^{-1} X^{n-k}$ and let v(X) be the 0-polynomial. Otherwise, since
deg(f) > deg(h), we see by the induction hypothesis that there exist polynomials
v(X) and w(X) in F[X] satisfying h(X) = g(X)w(X) + v(X), where deg(g) > deg(v).
Thus $f(X) = [a_n b_k^{-1} X^{n-k} + w(X)]g(X) + v(X)$, as required.

We are left to show uniqueness. Indeed, assume that f(X) equals both
$g(X)u_1(X) + v_1(X)$ and $g(X)u_2(X) + v_2(X)$, where $\deg(v_1) < \deg(g)$ and
$\deg(v_2) < \deg(g)$. Then $g(X)[u_1(X) - u_2(X)] + [v_1(X) - v_2(X)]$ equals the
0-polynomial. If we have $u_1(X) = u_2(X)$ then $v_1(X) = v_2(X)$, and we are done.
Therefore, assume that $u_1(X) \neq u_2(X)$. But then, since
$\deg(g[u_1 - u_2]) > \deg(v_1 - v_2)$ and since F[X] is entire, this is a
contradiction. Thus we have established uniqueness.
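The existence half of the proof is effectively an algorithm: repeatedly subtract $a_n b_k^{-1} X^{n-k} g(X)$ until the degree drops below deg(g). A minimal sketch over Q follows, assuming polynomials are stored as coefficient lists with the lowest degree first and the empty list standing for the 0-polynomial; the name `poly_divmod` is hypothetical.

```python
from fractions import Fraction


def poly_divmod(f, g):
    """Return (u, v) with f = g*u + v and deg(v) < deg(g), working over Q.

    Polynomials are coefficient lists, lowest degree first; [] is the
    0-polynomial.
    """
    def trim(p):
        p = list(p)
        while p and p[-1] == 0:
            p.pop()
        return p

    f = trim([Fraction(c) for c in f])
    g = trim([Fraction(c) for c in g])
    if not g:
        raise ZeroDivisionError("g must be a nonzero polynomial")
    u = [Fraction(0)] * max(len(f) - len(g) + 1, 0)
    h = f
    while len(h) >= len(g):
        shift = len(h) - len(g)        # n - k in the proof
        coeff = h[-1] / g[-1]          # a_n * b_k^{-1}
        u[shift] = coeff
        for i, b in enumerate(g):
            h[i + shift] -= coeff * b  # h := h - coeff * X^shift * g
        h = trim(h)
    return trim(u), h
```

For instance, dividing $X^3 + 2X + 1$ by $X + 1$ gives the quotient $X^2 - X + 3$ and the remainder $-2$.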


Let us emphasize that the set F[X] is composed of formal expressions and not
functions. Every polynomial $f(X) = \sum_{i=0}^{n} a_i X^i \in F[X]$ defines a corresponding
polynomial function in $F^F$ given by $c \mapsto f(c) = \sum_{i=0}^{n} a_i c^i$, but the
correspondence between polynomials and polynomial functions is not bijective. Indeed,
it is possible for two distinct polynomials to define the same polynomial function.
Thus, for example, if F = GF(2) then the distinct polynomials $X, X^2, X^3, \ldots$ all
define the same function from F to itself, namely the function given by $0 \mapsto 0$ and
$1 \mapsto 1$. The degree of a polynomial function is the least of the degrees of the
(perhaps many) polynomials which define that function.
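The GF(2) example can be checked mechanically. The sketch below assumes GF(p), for a prime p, is represented by the integers 0, ..., p - 1 with arithmetic modulo p, and that a polynomial is a coefficient list with the lowest degree first; `poly_function` is a name introduced here.

```python
def poly_function(coeffs, p):
    """Table of values (f(0), f(1), ..., f(p-1)) of the function defined on GF(p)."""
    return tuple(
        sum(a * pow(c, i, p) for i, a in enumerate(coeffs)) % p
        for c in range(p)
    )


# The distinct polynomials X, X^2, and X^3 over GF(2).
X1, X2, X3 = [0, 1], [0, 0, 1], [0, 0, 0, 1]
tables = {poly_function(f, 2) for f in (X1, X2, X3)}
```

The set `tables` contains a single table of values, so three distinct formal polynomials give one polynomial function.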




The first person to systematically consider the best methods of calculating
f(c) for a polynomial $f(X) \in F[X]$ and for $c \in F$ was the
twentieth-century Russian mathematician Alexander Markovich Ostrowski.


Let $p_1(X)$ and $p_2(X)$ be polynomials in F[X] and let $c \in F$. If we set
$f(X) = p_1(X) + p_2(X)$ and $g(X) = p_1(X)p_2(X)$ then it is clear that
$f(c) = p_1(c) + p_2(c)$ and $g(c) = p_1(c)p_2(c)$.


Proposition 4.3 Let F be a field and let p(X) be a polynomial in F[X].
Then an element c of F satisfies the condition that p(c) = 0 if and only if
there exists a polynomial $u(X) \in F[X]$ satisfying $p(X) = (X - c)u(X)$.


Proof By Proposition 4.2, we know that there exist polynomials u(X) and v(X) in
F[X] satisfying $p(X) = (X - c)u(X) + v(X)$, where $\deg(v) < \deg(X - c) = 1$.
Therefore, v(X) = b for some $b \in F$. If b = 0 then $p(c) = (c - c)u(c) = 0$.
Conversely, if p(c) = 0 then $0 = p(c) = (c - c)u(c) + b = b$ and so
$p(X) = (X - c)u(X)$.
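The decomposition $p(X) = (X - c)u(X) + b$ used in the proof can be computed in one pass. The sketch below uses the scheme usually called Horner's rule (or synthetic division), which the text does not name; it assumes a nonempty coefficient list ordered from the highest degree to the lowest, and `divide_by_linear` is a hypothetical name.

```python
def divide_by_linear(p, c):
    """Divide p(X) by X - c.

    p lists coefficients from the highest degree down to the constant term.
    Returns (u, b) with p(X) = (X - c) * u(X) + b, so that b = p(c).
    """
    partial = []
    value = 0
    for a in p:
        value = value * c + a     # Horner step
        partial.append(value)
    return partial[:-1], partial[-1]
```

For $p(X) = X^2 - 3X + 2$ and c = 1 the remainder is 0 and the quotient is $X - 2$, in agreement with Proposition 4.3.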


As an immediate consequence of this result, we see that if F is a field and if
p(X) is a nonzero polynomial in F[X], then the set of all elements c of F satisfying
p(c) = 0 is finite and, indeed, its size cannot exceed the degree of p(X).

Let F be a field. A polynomial $p(X) \in F[X]$ is reducible if and only if there
exist polynomials u(X) and v(X) in F[X], each of degree at least 1, satisfying
p(X) = u(X)v(X). Otherwise, the polynomial is irreducible. Many tests for the
irreducibility of polynomials in Q[X] have been devised. One of the earliest and
best known is Eisenstein’s criterion: if $p(X) = \sum_{i=0}^{n} a_i X^i \in \mathbb{Q}[X]$, where
each $a_i$ is an integer, and if there exists a prime integer q such that q does not
divide $a_n$, q divides $a_i$ for all $0 \leq i \leq n - 1$, and $q^2$ does not divide $a_0$,
then p(X) is irreducible. (A proof of this can be found in books on abstract
algebra.) Thus, using this
criterion, we see that 3X? + 7X? + 49X — 7 is an irreducible polynomial in Q[X]. 
Gauss’ brilliant student, Ferdinand Eisenstein, died of tuberculosis
at the age of 29. 
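The three divisibility conditions of Eisenstein’s criterion are easy to test by machine. The following sketch assumes integer coefficients listed as $[a_0, \ldots, a_n]$ with $n \geq 1$ and a caller-supplied prime q (primality is not checked); `eisenstein_applies` is a name introduced here, and the sample polynomials are illustrative ones, not taken from the text.

```python
def eisenstein_applies(coeffs, q):
    """True if the prime q satisfies Eisenstein's conditions for a_0, ..., a_n.

    coeffs = [a_0, a_1, ..., a_n] are integers, lowest degree first.
    A False result says nothing about reducibility: the criterion is
    only a sufficient condition for irreducibility over Q.
    """
    a0, an = coeffs[0], coeffs[-1]
    return (
        an % q != 0                                 # q does not divide a_n
        and all(a % q == 0 for a in coeffs[:-1])    # q divides a_0, ..., a_{n-1}
        and a0 % (q * q) != 0                       # q^2 does not divide a_0
    )
```

For example, the criterion applies to $X^3 - 2$ with q = 2, but not to $X^2 + 4$ with q = 2, since 4 divides the constant term.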


Example If F = GF(5) then the polynomial $X^3 + X + 1 \in F[X]$ is irreducible, a fact
which can be established, if necessary, by testing all possibilities. However, when
F = GF(3) it is easy to verify the factorization $X^3 + X + 1 = (X + 2)(X^2 + X + 2)$,
and thus see that the polynomial is reducible.
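"Testing all possibilities" is particularly simple for polynomials of degree 2 or 3, since such a polynomial is reducible precisely when it has a root in the field. A small sketch, assuming GF(p) is represented by the integers modulo a prime p, coefficient lists run from the lowest degree up, and reading the polynomial of this example as $X^3 + X + 1$; `roots_mod_p` is a hypothetical helper.

```python
def roots_mod_p(coeffs, p):
    """All c in GF(p) = {0, ..., p-1} at which the polynomial vanishes."""
    return [
        c for c in range(p)
        if sum(a * pow(c, i, p) for i, a in enumerate(coeffs)) % p == 0
    ]


cubic = [1, 1, 0, 1]            # 1 + X + X^3
roots_5 = roots_mod_p(cubic, 5)
roots_3 = roots_mod_p(cubic, 3)
```

Over GF(5) the list of roots is empty, so the cubic is irreducible there; over GF(3) the root 1 corresponds to the factor $X - 1 = X + 2$.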


Example If p(X) = u(X)v(X) in F[X], then surely p(X + c) = u(X + c)v(X + c)
for any $c \in F$, and so to prove that a polynomial p(X) is irreducible it suffices to
prove that p(X + c) is irreducible for some $c \in F$. For example, let q be a prime
integer. The qth cyclotomic polynomial in Q[X] is defined to be
$\Phi_q(X) = \sum_{i=0}^{q-1} X^i$. We claim that this polynomial is irreducible. To see that
this is so, we observe that $\Phi_q(X + 1) = X^{q-1} + \sum_{i=1}^{q-1} \binom{q}{i} X^{i-1}$, which is
irreducible by Eisenstein’s criterion.


It is known that the number of monic irreducible polynomials of positive degree
m in GF(p)[X] equals $N(p) = \frac{1}{m} \sum_{d \mid m} \mu(d) p^{m/d}$, where the sum ranges over
all positive integers d which divide m and the Möbius function $\mu(d)$ is defined by

$$\mu(d) = \begin{cases} 1 & \text{if } d = 1, \\ (-1)^k & \text{if } d \text{ is the product of } k \text{ distinct primes}, \\ 0 & \text{otherwise.} \end{cases}$$

This means that the probability of a randomly-selected monic polynomial of degree
m in GF(p)[X] being irreducible is $N(p)/p^m$, which is roughly 1/m. In particular,
we note that for every positive integer m there exists at least one monic irreducible
polynomial of degree m in GF(p)[X].
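The counting formula is straightforward to evaluate. The sketch below computes the Möbius function by trial division and then the number of monic irreducible polynomials of degree m over GF(p); both function names are introduced here, and p is assumed to be prime.

```python
def mobius(d):
    """Moebius function of a positive integer d, by trial division."""
    result = 1
    q = 2
    while q * q <= d:
        if d % q == 0:
            d //= q
            if d % q == 0:      # q^2 divided the original number
                return 0
            result = -result
        q += 1
    if d > 1:                   # one remaining prime factor
        result = -result
    return result


def count_monic_irreducibles(p, m):
    """(1/m) * sum over divisors d of m of mobius(d) * p^(m/d)."""
    total = sum(
        mobius(d) * p ** (m // d)
        for d in range(1, m + 1)
        if m % d == 0
    )
    return total // m
```

Over GF(2) this gives 1, 2, and 3 irreducible polynomials of degrees 2, 3, and 4, respectively, so that, for instance, 3 of the 16 monic quartics are irreducible, which is close to 1/4.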

Any polynomial of positive degree in F[X] can be written as a product of
irreducible polynomials. How to find such a decomposition, especially in the case
of polynomials over a finite field or over Q, is a very difficult and important
problem, which attracted such great mathematicians as Newton and which continues
to attract many important mathematicians to this day. Indeed, the problem of
factoring polynomials over finite fields into irreducible components has become
even more important, since it is the basis for many current cryptographic schemes.
There are algorithms, such as Berlekamp’s algorithm, which factor a polynomial
$f(X) \in F[X]$, where $F = GF(p^n)$, in a time polynomial in p, n, and deg(f).
Moreover, under various assumptions, such as the Generalized Riemann Hypothesis,
polynomials of special forms can be factored much more rapidly.
A polynomial $p(X) \in F[X]$ of positive degree is completely reducible if and
only if it can be written as a product of polynomials in F[X] of degree 1. Not every
polynomial over every field is completely reducible. For example, the polynomial
$X^2 + 1 \in \mathbb{Q}[X]$ is not completely reducible. The field F is algebraically closed if
every polynomial of positive degree in F[X] is completely reducible. The fields Q
and R are not algebraically closed. The field C is algebraically closed, by a theorem
known as the Fundamental Theorem of Algebra. This theorem is in fact analytic and
not algebraic, and relies on various analytic properties of functions of a complex
variable. Most of the great mathematicians of the eighteenth century (d’Alembert,
Euler, Laplace, Lagrange, Argand, Cauchy, and others) tried in vain to prove this
theorem. The first proof was given by Gauss in his doctoral thesis in 1799. His proof
was basically topological and relied on the work of Euler. During his lifetime, Gauss
published several proofs of this theorem.



A “nearly algebraic” proof was given by German/American mathematician Hans
Zassenhaus in 1969. Most proofs of the Fundamental Theorem of Algebra are existence
proofs and do not give a constructive method of finding the degree-one factors of
a polynomial over an algebraically-closed field. The first constructive proof was
given by the German mathematician Helmut Kneser in 1940.


Example The field F = GF(2) is not algebraically closed since the polynomial
$X^2 + X + 1 \in F[X]$ is not completely reducible.


Note that if a field F is algebraically closed then every polynomial function
$F \to F$ defined by a polynomial of positive degree is epic. Indeed, let $p(X) \in F[X]$
be a polynomial of positive degree and let $d \in F$. Then q(X) = p(X) - d is a
polynomial of positive degree in F[X] and so there exists an element c of F such that
q(c) = 0. In other words, p(c) = d.

It is easy to see that a polynomial $p(X) \in \mathbb{R}[X]$ of degree 1 is irreducible. If
$p(X) = aX^2 + bX + c$ is of degree 2 then, considering it as an element of C[X], we
have

$$p(X) = a\left(X + \frac{b}{2a} + \frac{\sqrt{b^2 - 4ac}}{2a}\right)\left(X + \frac{b}{2a} - \frac{\sqrt{b^2 - 4ac}}{2a}\right).$$

This factorization holds in R[X] as well if and only if $b^2 - 4ac \geq 0$, and so
p(X) is irreducible if and only if $b^2 - 4ac < 0$. From the following result we deduce
immediately that there are no irreducible polynomials in R[X] of degree greater
than 2.
Proposition 4.4 A monic polynomial $p(X) \in \mathbb{R}[X]$ is irreducible if and only
if it is of the form $X - a$ or $(X - a)^2 + b^2$, where $a \in \mathbb{R}$ and $0 \neq b \in \mathbb{R}$.


Proof Clearly, every polynomial of the form $X - a$ is irreducible. Now assume that
$f(X) = (X - a)^2 + b^2 = X^2 - 2aX + a^2 + b^2$. Were this polynomial reducible, we
could find real numbers c and d satisfying

$$f(X) = (X - c)(X - d) = X^2 - (c + d)X + cd$$

and so $c + d = 2a$ and $cd = a^2 + b^2$. This implies that $c^2 - 2ac + a^2 + b^2 = 0$ and
hence $c = \frac{1}{2}\left[2a \pm \sqrt{4a^2 - 4(a^2 + b^2)}\right] = a \pm \sqrt{-b^2}$, which contradicts the
assumption that $c \in \mathbb{R}$ since b is assumed to be nonzero. Thus polynomials of both
of the given forms are indeed irreducible.

Conversely, let $p(X) = \sum_{i=0}^{n} c_i X^i$ be a monic irreducible polynomial in R[X]
that is not of the form $X - a$. By the Fundamental Theorem of Algebra, we know
that there exists a complex number $z = a + bi$ satisfying p(z) = 0. Since the
coefficients of p(X) are real, this means that $p(\bar{z}) = 0$ as well, since
$0 = \bar{0} = \overline{p(z)} = \sum_{i=0}^{n} c_i \bar{z}^i = p(\bar{z})$. Thus there exists a polynomial
$u(X) \in \mathbb{R}[X]$ satisfying $p(X) = (X - z)(X - \bar{z})u(X)$, where
$(X - z)(X - \bar{z}) = X^2 - (z + \bar{z})X + z\bar{z} = X^2 - 2aX + a^2 + b^2$. Since p(X) was
assumed irreducible, we conclude that $z \notin \mathbb{R}$ (i.e., $b \neq 0$) and that p(X) equals
$X^2 - 2aX + a^2 + b^2$, as desired.


An obvious generalization of the above construction is the following: Let F be a
field and let (K, •) be an associative and commutative unital F-algebra. If X is an
element not in K, we can define a polynomial with coefficients in K as a formal sum
$f(X) = \sum_{i=0}^{\infty} a_i X^i$, in which the elements $a_i$ belong to K and no more than a
finite number of them differ from $0_K$. The set of all such polynomials will be
denoted by K[X]. As above, we define addition and multiplication in K[X] as follows:
if $f(X) = \sum_{i=0}^{\infty} a_i X^i$ and $g(X) = \sum_{i=0}^{\infty} b_i X^i$ belong to K[X], then f(X) + g(X)
is the polynomial $\sum_{i=0}^{\infty} c_i X^i$, in which $c_i = a_i + b_i$ for each $0 \leq i < \infty$, and
f(X)g(X) is the polynomial $\sum_{i=0}^{\infty} d_i X^i$, in which $d_i = \sum_{j=0}^{i} a_j \bullet b_{i-j}$ for each
$0 \leq i < \infty$. Again, it is easy to check that K[X] is an F-algebra. Moreover, as a
direct consequence of the definition of multiplication, we see that if K is entire then
so is K[X]. This generalization allows us to consider algebras of polynomials in
several commuting indeterminates with coefficients in K defined inductively by setting
$K[X_1, \ldots, X_n] = K[X_1, \ldots, X_{n-1}][X_n]$ for each n > 1. Elements of this algebra
are of the form

$$f(X_1, \ldots, X_n) = \sum a_{i_1, \ldots, i_n} X_1^{i_1} \cdots X_n^{i_n},$$

where the sum ranges over all n-tuples $(i_1, \ldots, i_n)$ of nonnegative integers and
at most finitely-many of the coefficients $a_{i_1, \ldots, i_n} \in K$ are nonzero. The degree of
$f(X_1, \ldots, X_n)$ is the maximal value of

$$\{ i_1 + \cdots + i_n \mid a_{i_1, \ldots, i_n} \neq 0 \}.$$
A polynomial $\sum a_{i_1, \ldots, i_n} X_1^{i_1} \cdots X_n^{i_n}$ in $K[X_1, \ldots, X_n]$ is flat if and only if
$a_{i_1, \ldots, i_n} \neq 0$ only when each $i_j$ is either 0 or 1.


Proposition 4.5 Let F be a field of characteristic other than 2 and let K be an
associative and commutative unital entire F-algebra. Let
$f(X_1, \ldots, X_n) = \sum a_{i_1, \ldots, i_n} X_1^{i_1} \cdots X_n^{i_n} \in K[X_1, \ldots, X_n]$ be a flat polynomial
of degree n. Then for each n-tuple $(c_1, \ldots, c_n)$ of nonzero elements of K there
exist $e_1, \ldots, e_n$ in K, each equal to $1_K$ or $-1_K$, such that
$f(e_1 c_1, \ldots, e_n c_n) \neq 0_K$.


Proof We will prove this result by induction on n. If n = 1, then $f(X_1) = a_1 X_1 + a_0$,
where $a_1 \neq 0_K$. If $0_K \neq c \in K$ then either $a_0 + a_1 c$ or $a_0 - a_1 c$ is nonzero, for
otherwise we would have $2a_1 c = 0_K$, which is impossible since K is entire and
the characteristic of F is not 2. Hence the case n = 1 has been established. Now
assume that n > 1 and that the proposition has been established for flat polynomials
in $K[X_1, \ldots, X_{n-1}]$. We can write the polynomial $f(X_1, \ldots, X_n)$ in the form
$g(X_1, \ldots, X_{n-1}) + h(X_1, \ldots, X_{n-1}) X_n$, where $h(X_1, \ldots, X_{n-1})$ is a flat polynomial
in $K[X_1, \ldots, X_{n-1}]$ of degree n - 1. If $(c_1, \ldots, c_n)$ is an n-tuple of nonzero
elements of K then, by the induction hypothesis, we can find $e_1, \ldots, e_{n-1}$ in K,
each equal to $1_K$ or $-1_K$, such that $h(e_1 c_1, \ldots, e_{n-1} c_{n-1}) \neq 0_K$. But then we have
$g(e_1 c_1, \ldots, e_{n-1} c_{n-1}) + h(e_1 c_1, \ldots, e_{n-1} c_{n-1}) X_n \in K[X_n]$ and so, by the case
n = 1, we can find $e_n$ equal to $1_K$ or $-1_K$ such that $f(e_1 c_1, \ldots, e_n c_n) \neq 0_K$.
Exercise 114
Let F be a field and let (K, e) and (L, x) be F-algebras. Define an operation © 


f 


/ 
on K x L by | © A = peel Is (K x L, ©) an F-algebra? 


Exercise 115 
Let F be a field and let (K, e) be a unitary, associative, commutative, and entire 


F -algebra which, as a vector space, is finitely generated over F. Is K necessarily 
a field? 


Exercise 116 

Let F be a field and let (K, e) be an associative F -algebra which, as a vector 
space, is finitely generated over F. Given an element a € K, do there necessarily 
exist elements a1, a2 € K satisfying aj è a2 =a? 


Exercise 117 


Define an operation è on R? by setting H ° i = pe al Show that 


this operation turns R? into an R-algebra. Is this algebra associative? 
Exercise 118
Let F be a field and let (K, e) be a unital F-algebra. Define an operation © on 
the vector space L = K x F by setting | © ; = [” es T ad for 


all v, w € K anda,be F. Is L an F-algebra? Is it unital? 


Exercise 119 

Let F be a field. An F-algebra (K,e) is a division algebra if and only if for 
every v € K and for every Og # w € K there exist unique vectors x, y € K, not 
necessarily equal, satisfying w ex = v and ye w = v. Is the algebra defined in 
the previous exercise a division algebra? 


Exercise 120 
Let K be the subset of M4 x4(R) consisting of all matrices of the form 
a -b —c -d 


: a T 3 for a,b,c,d € R and let L be the subset of M4x4(R) 
d —c b a 
a —b -c -d 
bis : b a d — ; 
consisting of all matrices of the form c -d 4 bh Are K and L uni- 


d c —b a 
tal subalgebras of M4x4(R)? Are they division algebras? 


Exercise 121 

Let F be a field and let (K, e) be an associative unital F-algebra with multiplica- 

tive identity e. For units v, w € K, show that: 

(1) ve (v7! + w7!) = (v+ w) e w7™!; 

(2) (v+ w)! e w= v! e (v7! +w7!)7! whenever v + w and v™! + w7! are 
also a units; 

(3) ve w™! +e= voe (v7! +w!) 


Exercise 122 

Let F be a field and let (K, e) be an associative F-algebra which, as a vector 
space, is finitely generated over F. Suppose that there exists an element y € K 
satisfying the condition that for each v € K there exists an element v’ € K satis- 
fying v’ e y = v. Show that each such element v’ must be unique. 


Exercise 123 

Let F be an infinite field and let (K, x) be an associative unital F-algebra. 
If v, w € K, show that there are infinitely-many elements w’ of K satisfying 
vew=vew ink. 


Exercise 124 
Let F be a field and let (K, •) be an associative unital F-algebra. If A and B are
subsets of K, we let $A \bullet B$ be the set of all elements of K of the form $a \bullet b$, with
$a \in A$ and $b \in B$ (in particular, $\emptyset \bullet B = A \bullet \emptyset = \emptyset$). We know that the set V of
all subsets of K is a vector space over GF(2). Is (V, •) a GF(2)-algebra? If so, is
it associative? Is it unital?


Exercise 125 

Let F be a field and let (K, •) be an associative F-algebra. If V and W are
subspaces of K, we let $V \bullet W$ be the set of all finite sums of the form
$\sum_{i=1}^{n} v_i \bullet w_i$, with $v_i \in V$ and $w_i \in W$. Is $V \bullet W$ necessarily a subspace of K?


Exercise 126 

Let (K, e) be an associative F-algebra and let v € K. If there exists an element 
y of K satisfying v e y e v = v, show that there also exists an element w of K 
satisfying v è w è v = v and w è v è w = w. 


Exercise 127 
For $v, w \in \mathbb{R}^3$, simplify the expression $(v + w) \times (v - w)$.


Exercise 128 
For $u, v, w \in \mathbb{R}^3$, simplify the expression $(u + v + w) \times (v + w)$.


Exercise 129 
Let F be a field and let (K, •) be an F-algebra satisfying the Jacobi identity.
Show that K is a Lie algebra if and only if $v \bullet v = 0_K$ for all $v \in K$.


Exercise 130 

Let F be a field and let (K, ∗) be an associative F-algebra. For each $0_F \neq c \in F$,
define an operation $\bullet_c$ on K by setting $v \bullet_c w = c(v * w + w * v)$. For which
values of c is $(K, \bullet_c)$ a Jordan algebra over F?


Exercise 131 

Let F be a field and let (K, •) be a unitary F-algebra. For each $v \in K$, let S(v)
be the set of all $a \in F$ satisfying the condition that $v - a1_K$ does not have an
inverse with respect to the operation •. If $v \in K$ has a multiplicative inverse $v^{-1}$
with respect to this operation, show that either $S(v) = \emptyset = S(v^{-1})$ or $S(v) \neq \emptyset$
and $S(v^{-1}) = \{a^{-1} \mid a \in S(v)\}$.


Exercise 132 
Let F be a field and let L be the set of all polynomials f(X) € F[X] satisfying 
the condition that f(—a) = — f (a) for all a € F. Is L a subspace of F[X]? 


Exercise 133 
Let F be a field and let L be the set of all polynomials f(X) € F[X] satisfying 
the condition that deg( f) is even. Is L a subspace of F[X]? 
Exercise 134
Let F be a field and let f(X),g(X) € F[X]. Show that deg(fg) = 
deg( f) + deg(g). 


Exercise 135 
Let F be a field and let $f(X), g(X) \in F[X]$. Show that
$\deg(f + g) \leq \max\{\deg(f), \deg(g)\}$, and give an example in which we do not have
equality.


Exercise 136 
Find polynomials u(X), v(X) € Q[X] satisfying 


X443X3 = (X? + X + 1)u(X) + v(X). 


Exercise 137 
Let F = GF(2). Find polynomials u(X), v(X) € F[X] satisfying 


X? + X? = (X? + X + 1)u(X) + v(X). 


Exercise 138 
Let F = GF(7). Find a nonzero polynomial p(X) € F[X] such that the polyno- 
mial function defined by p is the 0-function. 


Exercise 139 
Is the polynomial 6X4 + 3X? + 6X? + 2X + 5 € GF(7)[X] irreducible? 


Exercise 140 
Is the polynomial X 7 + X^ + 1 € Q[X] irreducible? 


Exercise 141 
Find t € R such that there exist a, b € R satisfying a + b = 1 and 2a? — a? — 
Ta +t=0=2b — b? —7b+t. 


Exercise 142 
For a field F, compare the subsets F[X?] and F[X? + 1] of FLX]. 


Exercise 143 

Let F = GF(p), where p is a prime integer, and let g be an arbitrary function 
from F to itself. Show that there exists a polynomial p(X) € F[X] of degree less 
than p satisfying the condition that g(c) = p(c) for all c € F. 


Exercise 144 
Let c be a nonzero element of a field F and let n > 1 be an integer. Show that
there exists a polynomial $p(X) \in F[X]$ satisfying $c^n + c^{-n} = p(c + c^{-1})$.
Exercise 145
Let F be a field. Find the set of all polynomials $0 \neq p(X) \in F[X]$ satisfying
$p(X^2) = p(X)^2$.


Exercise 146 
Let $p(X) = \frac{1}{k!} X(X - 1) \cdots (X - k + 1) \in \mathbb{Q}[X]$ for some positive integer k.
Show that $p(n) \in \mathbb{Z}$ for every nonnegative integer n.


Exercise 147 
Let $p(X) = nX^{n+1} - (n + 1)X^n + 1 \in \mathbb{Q}[X]$ for any positive integer n. Show
that there exists a polynomial $q_n(X) \in \mathbb{Q}[X]$ satisfying $p(X) = (X - 1)^2 q_n(X)$.


Exercise 148 

Let F be a field and let W be a nontrivial subspace of the vector space F[X] 
over F. Let p(X) € F[X] be a given monic polynomial and let p(X)W = 
{p(X) f(X) | f(X) € W}. Show that p(X)W is a subspace of F[X] and find 
a necessary and sufficient condition for it to equal W. 


Exercise 149 
Let p be a prime integer and let n be a positive integer. Does there necessarily 
exist an irreducible monic polynomial in GF(p)[X] of degree n? 


Exercise 150 

Let p be a prime integer and let n be a positive integer. Show that the product of
all irreducible monic polynomials in GF(p)[X] of degree dividing n is equal to
$X^{p^n} - X$.


Exercise 151 
Let n > 1 be an integer. Is the polynomial p(X) = 1 + X`} AX’ € Q[X] nec- 
essarily irreducible? 


Exercise 152 
Show that the polynomial $X^4 + 1$ is irreducible in Q[X] but reducible in
GF(p)[X] for every prime p.


Exercise 153 
Show that $X^4 + 2(1 - c)X^2 + (1 + c)^2 \in \mathbb{Q}[X]$ is irreducible for every $c \in \mathbb{Q}$
satisfying $\sqrt{c} \notin \mathbb{Q}$.


Exercise 154 

Let F be a field and let $K = F^{\mathbb{N}}$. Define operations + and • on K by setting
$f + g : i \mapsto f(i) + g(i)$ and $f \bullet g : i \mapsto \sum_{j+k=i} f(j)g(k)$. Show that K is an
associative and commutative unital F-algebra. Is it entire?
Exercise 155

Let k be a positive integer and let a < b be real numbers. A function $f \in \mathbb{R}^{[a,b]}$ is
a spline function of degree k if and only if there exist real numbers
$a = a_0 < \cdots < a_n = b$ and polynomials $p_0(X), \ldots, p_{n-1}(X)$ of degree k in R[X]
satisfying the condition that $f : x \mapsto p_i(x)$ for all $a_i \leq x \leq a_{i+1}$ and all
$0 \leq i \leq n - 1$. Spline functions play an important part in interpolation theory and
in numerical procedures for solving differential equations. Is the set of all spline
functions of fixed degree k a subspace of the vector space $\mathbb{R}^{[a,b]}$?



Spline functions were first defined and studied by the twentieth- 
century Romanian/American mathematician Isaac Jacob Schoen- 
berg. 


Exercise 156 
Let F be a finite field, let k > 1 be an integer, and let V be the vector space over
F consisting of all polynomials in F[X] having degree less than k. Let $a_1, \ldots, a_n$
be distinct elements of F and let W be the subset of $F^n$ consisting of all vectors
of the form $\begin{pmatrix} p(a_1) \\ \vdots \\ p(a_n) \end{pmatrix}$ for some $p \in V$. Is W a subspace of $F^n$?


Exercise 157 

A trigonometric polynomial in $\mathbb{R}^{\mathbb{R}}$ is a function of the form
$t \mapsto a_0 + \sum_{h=1}^{k} [a_h \cos(ht) + b_h \sin(ht)]$, where $a_0, \ldots, a_k, b_1, \ldots, b_k \in \mathbb{R}$. Show
that the subset of $\mathbb{R}^{\mathbb{R}}$ consisting of all trigonometric polynomials is an entire
R-algebra.
In this chapter, we will see how a restricted collection of vectors in a vector space
over a field can dictate the structure of the entire space, and we will deduce
far-ranging conclusions from this. Let V be a vector space over a field F. A nonempty
subset D of V is linearly dependent if and only if there exist distinct vectors
$v_1, \ldots, v_n$ in D and scalars $a_1, \ldots, a_n$ in F, not all of which are equal to 0,
satisfying $\sum_{i=1}^{n} a_i v_i = 0_V$. A list of elements of V is linearly dependent if it has
two equal members or if its underlying subset is linearly dependent. Clearly, any set
of vectors containing $0_V$ is linearly dependent. A nonempty set of vectors which is
not linearly dependent is linearly independent. That is to say, D is linearly
independent if and only if $D = \emptyset$ or $D \neq \emptyset$ and we have $\sum_{i=1}^{n} a_i v_i = 0_V$, with the
$a_i$ in F and the $v_i$ distinct vectors in D, when and only when $a_i = 0$ for all
$1 \leq i \leq n$. As a consequence of this definition, we see that an infinite set of vectors
is linearly dependent if and only if it has a finite linearly-dependent subset, and an
infinite set of vectors is linearly independent if and only if each of its finite
subsets is linearly independent. It is also clear that any set of vectors containing
a linearly-dependent subset is linearly dependent and that any subset of a
linearly-independent set of vectors is linearly independent.




The notion of linear independence of vectors was introduced by Grassmann;
it was extensively generalized to other mathematical contexts by the
twentieth-century American mathematician Hassler Whitney.


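Example The subset
$\left\{ \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}, \begin{pmatrix} -1 \\ 3 \\ 4 \end{pmatrix}, \begin{pmatrix} -4 \\ 7 \\ 11 \end{pmatrix} \right\}$
of $\mathbb{Q}^3$ is linearly dependent since

$$\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} = (-1)\begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} + 3\begin{pmatrix} -1 \\ 3 \\ 4 \end{pmatrix} + (-1)\begin{pmatrix} -4 \\ 7 \\ 11 \end{pmatrix}.$$

Similarly, the subset
$\left\{ \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \right\}$
of $\mathbb{Q}^3$ is linearly independent, since if

$$\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} = a\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + b\begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} + c\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix},$$

then $\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} a + b + c \\ b + c \\ c \end{pmatrix}$ and this implies that a = b = c = 0.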
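For finite sets of vectors in $\mathbb{Q}^n$, linear dependence can be decided by Gaussian elimination: the set is linearly independent exactly when the rank equals the number of vectors. The sketch below is one way to do this with exact rational arithmetic; the function name `rank` and the representation of vectors as lists are assumptions made here.

```python
from fractions import Fraction


def rank(vectors):
    """Rank over Q of a list of equal-length vectors, by Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0                                   # number of pivots found so far
    for col in range(ncols):
        pivot = next(
            (i for i in range(r, len(rows)) if rows[i][col] != 0), None
        )
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


dependent = [[1, 2, 1], [-1, 3, 4], [-4, 7, 11]]
independent = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]
```

The first set has rank 2, which is less than its size, so it is linearly dependent; the second has rank 3.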
0 c 
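Example The subset

$$\left\{ \begin{pmatrix} 1 \\ 1 \\ 0 \\ 1 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \\ 1 \\ 1 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 1 \\ 1 \\ 0 \\ 0 \\ 1 \end{pmatrix} \right\}$$

of $V = GF(2)^7$ is linearly independent and generates a subspace of V composed of
eight vectors:

$$\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 1 \\ 0 \\ 1 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \\ 1 \\ 1 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 1 \\ 1 \\ 0 \\ 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 1 \\ 0 \\ 1 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \\ 1 \\ 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \\ 1 \end{pmatrix}, \text{ and } \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}.$$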


Note that in every element of this subspace other than the identity element for
addition, a majority of the entries are nonzero. This property makes this subspace
of V important in algebraic coding theory.
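The subspace and the "majority of entries" property can be checked by enumerating all linear combinations over GF(2). In the sketch below the three generators are read column by column from the display above as (1,1,0,1,1,0,0), (1,0,1,1,0,1,0), and (0,1,1,1,0,0,1); that reading, and the helper name `span_gf2`, are assumptions of this illustration.

```python
from itertools import product

generators = [
    (1, 1, 0, 1, 1, 0, 0),
    (1, 0, 1, 1, 0, 1, 0),
    (0, 1, 1, 1, 0, 0, 1),
]


def span_gf2(gens):
    """All linear combinations of gens with coefficients in GF(2) = {0, 1}."""
    n = len(gens[0])
    vectors = set()
    for coeffs in product((0, 1), repeat=len(gens)):
        v = tuple(
            sum(c * g[i] for c, g in zip(coeffs, gens)) % 2 for i in range(n)
        )
        vectors.add(v)
    return vectors


subspace = span_gf2(generators)
# Number of nonzero entries in each vector of the subspace, in sorted order.
weights = sorted(sum(v) for v in subspace)
```

Eight distinct vectors arise from the eight coefficient choices, which confirms linear independence, and every nonzero vector has exactly four nonzero entries out of seven.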


Example Let b > 1 be a real number, let $\{p_1, p_2, \ldots\}$ be the set of prime integers
and, for each i, let $u_i = \log_b(p_i)$. We claim that $D = \{u_1, u_2, \ldots\}$ is a
linearly-independent subset of R when it is considered as a vector space over Q. Indeed,
assume that this is not the case. Then there are a positive integer n and rational
numbers $a_1, \ldots, a_n$, not all equal to 0, satisfying $\sum_{i=1}^{n} a_i u_i = 0$. If we multiply
both sides by the product of the denominators of the $a_i$, we can assume that the $a_i$
are integers. Then

$$1 = b^0 = b^{\sum_{i=1}^{n} a_i u_i} = \prod_{i=1}^{n} b^{a_i u_i} = \prod_{i=1}^{n} (b^{u_i})^{a_i} = \prod_{i=1}^{n} p_i^{a_i}$$

and this is a contradiction, since the $p_i$ are distinct primes and the integer
exponents $a_i$ are not all equal to 0. Therefore, D must be linearly independent.


Example Let F be a field and let $\Omega$ be a nonempty set. Let $V_i$ be a vector space
over F for each $i \in \Omega$, and set $V = \prod_{i \in \Omega} V_i$. We have already seen that the
identity for addition in this vector space is the function $g_0 : \Omega \to \bigcup_{i \in \Omega} V_i$ given by
$g_0 : i \mapsto 0_{V_i}$. For each $i \in \Omega$, let $f_i : \Omega \to \bigcup_{j \in \Omega} V_j$ be a function satisfying the
condition that $f_i(i) \neq g_0(i)$ but $f_i(h) = g_0(h)$ for all $h \in \Omega \setminus \{i\}$. We claim that the
subset $\{f_i \mid i \in \Omega\}$ of V is linearly independent. To see this, assume that there
exists a finite subset $\Lambda$ of $\Omega$ and a family of scalars $\{c_h \mid h \in \Lambda\}$ such that
$\sum_{h \in \Lambda} c_h f_h = g_0$. Then for each $k \in \Lambda$ we have
$g_0(k) = \left(\sum_{h \in \Lambda} c_h f_h\right)(k) = \sum_{h \in \Lambda} c_h f_h(k) = c_k f_k(k)$
and since, by definition, $f_k(k) \neq g_0(k)$, we must have $c_k = 0$.


Example If F is a field, the subset $\{1, X, X^2, \ldots\}$ of F[X] is surely linearly
independent, since $\sum_{i=0}^{n} a_i X^i = 0$ if and only if each of the coefficients $a_i$ equals 0.


Example Let $V = \mathbb{R}^{\mathbb{R}}$ be the vector space, over R, of all functions from R to itself.
Let D be the set of all functions of the form $x \mapsto e^{ax}$ for some real number a. We
claim that D is linearly independent. Indeed, assume that there are distinct real
numbers $a_1, \ldots, a_n$ and real numbers $c_1, \ldots, c_n$ such that the function
$x \mapsto \sum_{i=1}^{n} c_i e^{a_i x}$ equals the 0-function $f_0 : x \mapsto 0$, which is the identity element of
V for addition. We need to show that each of the $c_i$ equals 0, and this we will do by
induction on n.

If n = 1 then we must have $c_1 = 0$ since the function $x \mapsto e^{ax}$ is different from
$f_0$ for each $a \in \mathbb{R}$. Assume therefore that n > 1 and that every subset of D having
no more than n - 1 elements is linearly independent. For each $1 \leq i \leq n - 1$, set
$b_i = a_i - a_n$. Then, for every real number x,

$$0 = e^{-a_n x}\left[\sum_{i=1}^{n} c_i e^{a_i x}\right] = \left[\sum_{i=1}^{n-1} c_i e^{b_i x}\right] + c_n$$

and if we differentiate both sides of the equation, we see that $f_0$ is the function
$x \mapsto \sum_{i=1}^{n-1} b_i c_i e^{b_i x}$. By the induction hypothesis and the choice of the scalars $a_i$
as being distinct, it follows that $b_i c_i = 0 \neq b_i$ for each $1 \leq i \leq n - 1$ and so
$c_i = 0$ for all $1 \leq i \leq n - 1$. This in turn implies that $c_n = 0$ as well.

Similarly, let G be the subset of V consisting of all of the functions of the form
$g_i : x \mapsto x^{i-1} 2^{x-1}$ for positive integers i. We claim that this set too is linearly
independent. Indeed, assume otherwise. Then there exists a positive integer n and
there exist real numbers $c_1, \ldots, c_n$, not all equal to 0, such that
$\sum_{i=1}^{n} c_i g_i = f_0$. But this implies that $2^{x-1}\left(\sum_{i=1}^{n} c_i x^{i-1}\right) = 0$ for each real
number x. Since $2^{x-1} \neq 0$ for each $x \in \mathbb{R}$, we conclude that $\sum_{i=1}^{n} c_i x^{i-1} = 0$ for
all x. But the polynomial function $x \mapsto \sum_{i=1}^{n} c_i x^{i-1}$ from R to itself has
infinitely-many roots if and only if $c_i = 0$ for all i, which is a contradiction,
proving linear independence.
Note that if {v, w} is a linearly-dependent set of vectors in an anticommutative
algebra (K, •) over a field of characteristic other than 2, then there exist scalars a
and b, not both equal to 0, such that $av + bw = 0_K$. Relabeling if necessary, we can
assume that $b \neq 0$. Then $0_K = v \bullet (av + bw) = a(v \bullet v) + b(v \bullet w) = b(v \bullet w)$ and
so $v \bullet w = 0_K$. A simple induction argument shows that if D is a linearly-dependent
subset of K then $v_1 \bullet \cdots \bullet v_n = 0_K$ for any finite subset $\{v_1, \ldots, v_n\}$ of D.

Note too that Proposition 3.7 can be easily iterated to get the more general result
that if D is a nonempty subset of a vector space V over a field F and if B is a finite
linearly-independent subset of FD having k elements, then there exists a subset $D'$
of D also having k elements satisfying the condition that $F((D \setminus D') \cup B) = FD$.
Moreover, if D is linearly independent, so is $(D \setminus D') \cup B$. This result is sometimes
known as the Steinitz Replacement Property.


Proposition 5.1 Let V be a vector space over a field F. A nonempty subset 
D of V is linearly dependent if and only if some element of D is a linear 
combination of the others over F. 


Proof Assume D is linearly dependent. Then there exist a finite subset $\{v_1, \ldots, v_n\}$
of D and scalars $a_1, \ldots, a_n$, not all of which equal 0, satisfying $\sum_{i=1}^{n} a_i v_i = 0_V$.
Say $a_h \neq 0$. Then $v_h = -a_h^{-1} \sum_{i \neq h} a_i v_i$ and so we see that $v_h$ is a linear
combination of the other elements of D over F. Conversely, assume that some
element of D is a linear combination of the others over F. That is to say, there is
an element $v_1$ of D, elements $v_2, \ldots, v_n$ of $D \setminus \{v_1\}$ and scalars $a_2, \ldots, a_n$ in F
satisfying $v_1 = \sum_{i=2}^{n} a_i v_i$. If we set $a_1 = -1$, we see that $\sum_{i=1}^{n} a_i v_i = 0_V$ and so
D is linearly dependent.


Example For every real number a, let $f_a$ be the function in $\mathbb{R}^{\mathbb{R}}$ defined by
$f_a : x \mapsto |x - a|$. We claim that the subset $D = \{f_a \mid a \in \mathbb{R}\}$ of $\mathbb{R}^{\mathbb{R}}$ is linearly
independent. Indeed, assume that this is not the case. Then there exists a real
number b such that $f_b$ is a linear combination of other members of D. In other words,
there exist a finite subset E of $\mathbb{R} \setminus \{b\}$ and scalars $c_a$ for each $a \in E$ such that
$f_b = \sum_{a \in E} c_a f_a$. But the function on the right-hand side of this equation is
differentiable at b, while the function on the left-hand side is not. From this
contradiction, we see that D is linearly independent.


If A is a nonempty set, then a relation $\preceq$ between elements of A is called a partial
order relation if and only if the following conditions are satisfied:
(1) $a \preceq a$ for all $a \in A$;
(2) If $a \preceq b$ and $b \preceq a$ then a = b;
(3) If $a \preceq b$ and $b \preceq c$ then $a \preceq c$.
The term “partial” comes from the fact that, given elements a and b of A, it may
happen that neither $a \preceq b$ nor $b \preceq a$. A set on which a partial order has been defined
is a partially-ordered set. A partially-ordered set A satisfying the condition that for
all $a, b \in A$ we have either $a \preceq b$ or $b \preceq a$ is called a chain. A nonempty subset
B of a partially-ordered set A is itself partially-ordered relative to the partial order
relation defined on A; it is a chain subset if it is a chain relative to the partial order
defined on A.

If A is a nonempty set on which we have a partial order relation $\preceq$ defined, then
an element $a_0$ of A is maximal in A if and only if $a_0 \preceq a$ when and only when
$a = a_0$. An element $a_1$ is minimal if and only if $a \preceq a_1$ when and only when $a = a_1$.
Maximal and minimal elements need not exist or, if they exist, need not be unique. 
The Well Ordering Principle, one of the fundamental axioms of number theory, says 
that any nonempty subset of N, ordered with the usual partial order, has a minimal 
element. This principle is equivalent to the principle of mathematical induction. 

Partial order relations are ubiquitous in mathematics, and often play a very impor- 
tant, though not usually highlighted, part in the analysis of mathematical structures. 


Example Let A be a nonempty set and let P be the collection of all subsets of A. 
Define a relation $\preceq$ between elements of P by setting $B \preceq B'$ if and only if $B \subseteq B'$.
It is easy to verify that this is indeed a partial order relation. Moreover, P has a
unique maximal element, namely A, and a unique minimal element, namely $\emptyset$. The
set P is not a chain whenever A has more than one element since, if a and b are
distinct elements of A, then $\{a\} \not\subseteq \{b\}$ and $\{b\} \not\subseteq \{a\}$.


Example Let A = {1, 2,3} and let P be the collection of all subsets of A having one 
or two elements. Thus P has six elements: {1}, {2}, {3}, {1,2}, {1,3}, and {2, 3}. 
Again, the relation $\preceq$ between elements of P defined by setting $B \preceq B'$ if and only
if $B \subseteq B'$ is a partial order relation. Moreover, P has three minimal elements: {1},
{2}, and {3}; it also has three maximal elements: {1, 2}, {1,3}, and {2, 3}. 
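
The minimal and maximal elements in this example can be checked by brute force. The following is a minimal sketch, assuming subsets are modelled as Python `frozenset` objects (for which `<` means proper inclusion); the helper names `minimal_elements` and `maximal_elements` are illustrative and do not come from the text.

```python
from itertools import combinations

# P: all subsets of A = {1, 2, 3} having one or two elements.
A = {1, 2, 3}
P = [frozenset(c) for r in (1, 2) for c in combinations(sorted(A), r)]

def minimal_elements(poset):
    """Elements b such that no element of the poset is a proper subset of b."""
    return [b for b in poset if not any(c < b for c in poset)]

def maximal_elements(poset):
    """Elements b such that b is a proper subset of no element of the poset."""
    return [b for b in poset if not any(b < c for c in poset)]

minimal = sorted(sorted(b) for b in minimal_elements(P))
maximal = sorted(sorted(b) for b in maximal_elements(P))
print(minimal)  # the three singletons
print(maximal)  # the three two-element subsets
```

Neither list has a single entry, which matches the remark that maximal and minimal elements need not be unique.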


In general, if we have a collection of subsets of a given set, the collection is 
partially-ordered by setting $B \preceq B'$ if and only if $B \subseteq B'$. Therefore, it makes sense
for us to talk about “a minimal generating set” of a vector space V—namely a 
minimal element in the partially-ordered collection of all generating sets of V— 
and about “a maximal linearly-independent subset” of a vector space V—namely 
a maximal element of the partially-ordered collection of all linearly-independent 
subsets of V. However, we have no a priori guarantee that such minimal or maximal 
elements in fact exist. 


Example Consider the set A of all integers greater than 1, and define a relation $\preceq$
on A by setting $k \preceq n$ if and only if there is a positive integer t satisfying $n = tk$.
This is a partial order relation on A. Moreover, A has infinitely-many minimal elements,
since each prime integer is a minimal element of A, while it has no maximal
elements, since $n \preceq 2n$ for each $n \in A$.


Proposition 5.2 Let V be a vector space over a field F. Then the following
conditions on a subset D of V are equivalent: 

(1) D is a minimal set of generators of V; 

(2) D is a maximal linearly-independent subset of V; 

(3) D is a linearly-independent set of generators of V.


Proof (1) $\Rightarrow$ (2): Let D be a minimal set of generators of V, and assume that D is
linearly dependent. By Proposition 5.1, there exists an element $v_0 \in D$ which is a
linear combination of elements of the set $E = D \setminus \{v_0\}$ over F. Say $v_0 = \sum_{i=1}^n a_i u_i$,
where the $u_i$ belong to E and the $a_i$ are scalars in F. If v is an arbitrary element of
V then, since D is a set of generators of V, there exist elements $v_1, \ldots, v_m$ of
E and scalars $b_0, b_1, \ldots, b_m$ such that $v = \sum_{j=0}^m b_j v_j$. But this then implies that
$v = b_0 v_0 + \sum_{j=1}^m b_j v_j = \sum_{i=1}^n b_0 a_i u_i + \sum_{j=1}^m b_j v_j$ and so E is also a set of
generators of V, contradicting the minimality of D. This establishes the claim that D is
linearly independent. If $v \in V \setminus D$, the set $D \cup \{v\}$ is linearly dependent since v is
a linear combination of elements of D. Thus D is a maximal linearly-independent
set.

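(2) $\Rightarrow$ (3): Assume that D is a maximal linearly-independent subset of V.

Consider a vector $v_0$ in $V \setminus D$. By (2), we know that the set $D \cup \{v_0\}$ is linearly
dependent, and so there exist elements $v_1, \ldots, v_n$ of D and scalars $a_0, a_1, \ldots, a_n$, not all
equal to 0, such that $\sum_{i=0}^n a_i v_i = 0_V$. Since D is linearly independent, we must have
$a_0 \neq 0$, and hence $v_0 = -a_0^{-1} \sum_{i=1}^n a_i v_i \in FD$, which proves that D is a set of generators of V.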

(3) $\Rightarrow$ (1): Assume that D is a linearly-independent set of generators of V
and that E is a proper subset of D which is also a set of generators for V. Let
$v_0 \in D \setminus E$. Then there exist elements $v_1, \ldots, v_n$ of E and scalars $a_1, \ldots, a_n$ such
that $v_0 = \sum_{i=1}^n a_i v_i$. But, by Proposition 5.1, this implies that the set D is linearly
dependent, contradicting (3). Therefore, no such E exists and so D is a minimal set 
of generators of V. 


Proposition 5.3 Let V be a vector space over a field F and let D be a 
linearly-independent subset of V. If $v_0 \in V \setminus FD$ then the set $D \cup \{v_0\}$ is
linearly independent. 


Proof Assume that this set is linearly dependent. Then there exist elements
$v_1, \ldots, v_n$ of D and scalars $a_0, a_1, \ldots, a_n$, not all equal to 0, such that
$\sum_{i=0}^n a_i v_i = 0_V$. The scalar $a_0$ must be different from 0, for otherwise D would be linearly
dependent, which is a contradiction. Therefore, $v_0 = \sum_{i=1}^n -a_0^{-1} a_i v_i \in FD$, which
contradicts the choice of $v_0$. Thus $D \cup \{v_0\}$ must be linearly independent.


Proposition 5.3 has important implications. For example, let V be a vector space
over a field F which is not finitely generated and let $D = \{v_1, \ldots, v_n\}$ be a
linearly-independent subset of V. Then $FD \neq V$, since V is not finitely generated, and so
there exists a vector $v_{n+1} \in V \setminus FD$ such that $\{v_1, \ldots, v_{n+1}\}$ is linearly independent.
Thus we see that a vector space which is not finitely generated has linearly-independent
finite subsets of arbitrarily-large size.

A generating set for a vector space V over a field F which is also linearly independent
is called a basis of V over F. In Proposition 5.2, we gave some equivalent
conditions for determining whether a subset of a vector space is a basis. However, we have
not yet proven that every (or, indeed, any) vector space must have a basis. 


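Example Clearly,

$$\left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right\} \quad \text{and} \quad \left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \right\}$$

are bases of $F^3$ for any field F. If the characteristic of F is other than 2, then

$$\left\{ \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} \right\}$$

is a basis of $F^3$ but if the field F has characteristic 2, then the set is linearly dependent, since

$$\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.$$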
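
The dependence on the characteristic can be checked numerically for prime fields. The sketch below assumes F = Z/pZ for a prime p (only one family of fields covered by the example) and computes the rank of the three vectors by Gaussian elimination; the function name `rank_mod_p` is illustrative.

```python
def rank_mod_p(rows, p):
    """Rank of an integer matrix over the field Z/pZ (p prime), by Gaussian elimination."""
    m = [[x % p for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Look for a row at or below position `rank` with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)  # multiplicative inverse mod p (Python 3.8+)
        m[rank] = [(x * inv) % p for x in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [(x - factor * y) % p for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank

vectors = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]
for p in (2, 3, 5):
    print(p, rank_mod_p(vectors, p))  # rank 3 means the set is a basis of F^3
```

The inversion step `pow(x, -1, p)` is exactly where the field axioms are used: it exists for every nonzero residue only because p is prime.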


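Example Let F be a field and let both k and n be positive integers. For each
$1 \le s \le k$ and each $1 \le t \le n$, let $H_{st}$ be the matrix $[a_{ij}]$ in $M_{k \times n}(F)$ defined by

$$a_{ij} = \begin{cases} 1 & \text{if } (i, j) = (s, t), \\ 0 & \text{otherwise.} \end{cases}$$

Then $\{H_{st} \mid 1 \le s \le k \text{ and } 1 \le t \le n\}$ is a basis of $M_{k \times n}(F)$.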


Example If F is a field, then we have already seen that the subset $\{1, X, X^2, \ldots\}$ of
F[X] is a linearly-independent generating set for F[X] as a vector space over F, and
so is a basis of this space. The same is true for the subset $\{1, X + 1, X^2 + X + 1, \ldots\}$
of F[X]. More generally, if $\{p_0(X), p_1(X), \ldots\}$ is a subset of F[X] satisfying the
condition that $\deg(p_i(X)) = i$ for all $i \ge 0$, then it is a basis of F[X] as a vector
space over F. 


Since every element of a vector space V over a field F has a unique representa- 
tion as a linear combination of elements of a basis, if one wants to define a structure 
of an F-algebra on V it suffices to define the product of any pair of basis elements, 
and then extend the definition by distributivity and associativity. This is illustrated 
by the following example, and we will come back to it again in Proposition 5.5. 


Example We have already noted that if F is a field then F[X] is an associative
F-algebra. Let us generalize this construction. Let H be a nonempty set on
which we have defined an associative operation $*$. Thus, for example, H could
be the set of nonnegative integers with the operation of addition or multiplication.
Let V be the vector space over F with basis $\{v_h \mid h \in H\}$ and define an
operation $\bullet$ on V as follows: if $v = \sum_{g \in H} a_g v_g$ and $w = \sum_{h \in H} b_h v_h$ are elements
of V (where at most finitely-many of the $a_g$ and the $b_h$ are nonzero), then set
$v \bullet w = \sum_{g \in H} \sum_{h \in H} a_g b_h v_{g * h}$. This turns V into an associative F-algebra. In the
case $H = \{X^i \mid i \ge 0\}$, we get F[X]. Such constructions are very important in advanced
applications of linear algebra.
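
The construction above can be sketched in a few lines. This is only an illustration under stated assumptions: elements of V are stored as dictionaries mapping an element h of H to the coefficient of $v_h$, the field is taken to be Q (via `fractions.Fraction`), and the function name `multiply` is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from operator import add, mul

def multiply(v, w, op):
    """Product of two finitely-supported elements of the algebra with basis {v_h | h in H}.

    v and w map an element h of H to the coefficient of v_h;
    op is the associative operation on H.
    """
    result = defaultdict(Fraction)  # missing coefficients default to 0
    for g, a_g in v.items():
        for h, b_h in w.items():
            result[op(g, h)] += a_g * b_h
    # Keep only the nonzero coefficients, so the support stays finite and minimal.
    return {h: c for h, c in result.items() if c != 0}

# H = nonnegative integers under addition: the key i plays the role of X^i.
p = {0: Fraction(1), 1: Fraction(1)}    # 1 + X
q = {0: Fraction(1), 1: Fraction(-1)}   # 1 - X
print(multiply(p, q, add))              # coefficients of 1 - X^2

# H = nonnegative integers under multiplication: v_2 times v_3 is v_6.
print(multiply({2: Fraction(1)}, {3: Fraction(1)}, mul))
```

With `op = add` this is ordinary polynomial multiplication, in line with the remark that the case $H = \{X^i \mid i \ge 0\}$ recovers F[X].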


Note that a vector space may have (and usually does have) many bases
and so the problem arises as to whether there is a preferred basis among all of
these. For vector spaces of the form $F^n$, there are reasons to prefer the basis

$$\left\{ \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}, \ldots, \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} \right\};$$

for vector spaces of the form $M_{k \times n}(F)$ there are reasons to prefer
the basis $\{H_{st} \mid 1 \le s \le k \text{ and } 1 \le t \le n\}$ defined above; and for vector
spaces of the form F[X] there are reasons to prefer the basis $\{1, X, X^2, \ldots\}$. These
bases are called the canonical bases of their respective spaces. However, in various 
applications—especially those involving large calculations—it is often convenient 
and sometimes extremely important to pick other bases which fit the problem un- 
der consideration. Indeed, in applications many considerations arise in choosing a 
basis D for a given vector space V. For example, we would like the representation of
elements of V as linear combinations of elements of the basis to be stable under
perturbations of the coefficients. That is to say, if $v = \sum_{i=1}^n a_i v_i$, where the $v_i$ are
elements of D, and if $a_i'$ is a scalar near $a_i$ for each $1 \le i \le n$, then we would
like $\sum_{i=1}^n a_i' v_i$ to be, in some sense, near v. (What “near” means here depends on
notions of distance arising from the particular situation under consideration.) This
is especially important if our data is based on observation or measurement which
is not assumed to be entirely accurate. For instance, we might want to choose the
basis taking into account the fact that the coefficient of $v_h$ is much more dubious
than the coefficients of the other basis elements, or choose it so that all of the
coefficients $a_i$ are of the same numerical order of magnitude for those vectors v in
which we are really interested and for which we will have to do extensive calculation.

It is also important to emphasize another point. When we defined the notation 
for this book, we stressed that when a set is defined by listing its elements, the set 
comes with an implicit order defined by that listing. When we deal with bases, and 
especially finite bases, the order in which the elements of the basis are written often 
plays a critical role, and one should never lose track of this. 


Proposition 5.4 Let V be a vector space over a field F and let D be a 
nonempty subset of V. Then D is a basis of V if and only if every vector in V 
can be written as a linear combination of elements of D over F in precisely 
one way. 


Proof First, let us assume that D is a basis of V and that there exists an
element v of V which can be written as a linear combination of elements of D
over F in two different ways. That is to say, there exist a finite subset
$\{v_1, \ldots, v_n\}$ of D and scalars $a_1, \ldots, a_n, b_1, \ldots, b_n$ in F such that
$v = \sum_{i=1}^n a_i v_i = \sum_{i=1}^n b_i v_i$, where $a_h \neq b_h$ for at least one index h. Then

$$0_V = v - v = \left(\sum_{i=1}^n a_i v_i\right) - \left(\sum_{i=1}^n b_i v_i\right) = \sum_{i=1}^n [a_i - b_i] v_i,$$

where at least one of the scalars $a_i - b_i$ is nonzero. This contradicts the assumption
that D is a basis and hence linearly independent. Therefore, every vector in V can be
written as a linear combination of elements of D over F in precisely one way.

Conversely, assume that every vector in V can be written as a linear combination 
of elements of D over F in precisely one way. That certainly implies that D is a 
generating set for V over F. If $\{v_1, \ldots, v_n\}$ is a subset of D and if $a_1, \ldots, a_n$ are
scalars satisfying $\sum_{i=1}^n a_i v_i = 0_V$, then we have $\sum_{i=1}^n a_i v_i = \sum_{i=1}^n 0 v_i$ and so, by
uniqueness of representation, we have $a_i = 0$ for each $1 \le i \le n$. This shows that D
is linearly independent and so a basis. 
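
For a concrete finite-dimensional case, the unique coefficients promised by Proposition 5.4 can be computed by solving a linear system. The following is a minimal sketch, assuming V = Q^n with exact rational arithmetic; the function name `coordinates` is illustrative and not part of the text.

```python
from fractions import Fraction

def coordinates(basis, v):
    """Coefficients of v relative to `basis`, a list of n vectors forming a basis of Q^n.

    Solves sum_i a_i * basis[i] = v by Gauss-Jordan elimination.
    Raises ValueError if the given vectors do not form a basis.
    """
    n = len(v)
    if len(basis) != n or any(len(b) != n for b in basis):
        raise ValueError("expected n vectors of length n")
    # Augmented matrix: column j holds basis[j], the last column holds v.
    m = [[Fraction(basis[j][i]) for j in range(n)] + [Fraction(v[i])] for i in range(n)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            raise ValueError("the given vectors are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        pivot_value = m[col][col]
        m[col] = [x / pivot_value for x in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] for i in range(n)]

basis = [(1, 0, 0), (1, 1, 0), (1, 1, 1)]
coeffs = coordinates(basis, (3, 2, 1))
print(coeffs)  # one coefficient per basis vector, in the order the basis is listed
```

Because the coefficients are returned in the order in which the basis is listed, reordering the basis permutes the output.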


We can look at Proposition 5.4 from another point of view. Let D be a nonempty 
subset of a vector space V over a field F, and define a function $\theta : F^{(D)} \to V$ by
setting $\theta : f \mapsto \sum_{u \in D} f(u)u$. (This sum is well-defined since only finitely-many of
the summands are nonzero.) Then:

(1) The function $\theta$ is monic if and only if D is linearly independent;
(2) The function $\theta$ is epic if and only if D is a generating set;
(3) The function $\theta$ is bijective if and only if D is a basis.


Proposition 5.5 Let D be a basis for a vector space V over a field F. Then 
any function $f : D \times D \to V$ can be extended in a unique manner to a function
$V \times V \to V$ which defines on V the structure of an F-algebra. Moreover,
all F-algebra structures on V arise in this manner.


Proof Let $D = \{y_i \mid i \in \Omega\}$. Suppose that we are given a function $f : D \times D \to V$.
We define an operation $\bullet$ on V as follows: if $v, w \in V$, then, by Proposition 5.4, we
know that we can write $v = \sum_{i \in \Omega} a_i y_i$ and $w = \sum_{j \in \Omega} b_j y_j$ in a unique manner,
where the $a_i$ and $b_j$ are scalars, only a finite number of which are nonzero; then set
$v \bullet w = \sum_{i \in \Omega} \sum_{j \in \Omega} a_i b_j f(y_i, y_j)$. It is straightforward to show that this defines
the structure of an F-algebra on V. Conversely, if $(V, \bullet)$ is an F-algebra, define the
function $f : D \times D \to V$ by $f : (y_i, y_j) \mapsto y_i \bullet y_j$.


The function f in Proposition 5.5 is the multiplication table of the vector multiplication
operation $\bullet$ with respect to the basis D.


Example Let F be a field and let $a, b \in F$. Let $B = \{v_1, v_2, v_3, v_4\}$ be the canonical
basis for $F^4$ over F. Define an operation $\bullet$ on B according to the multiplication
table:

$$\begin{array}{c|cccc}
\bullet & v_1 & v_2 & v_3 & v_4 \\ \hline
v_1 & v_1 & v_2 & v_3 & v_4 \\
v_2 & v_2 & a v_1 & v_4 & a v_3 \\
v_3 & v_3 & -v_4 & b v_1 & -b v_2 \\
v_4 & v_4 & -a v_3 & b v_2 & -ab v_1
\end{array}$$

and extend this operation to $F^4$ by setting

$$\left(\sum_{i=1}^4 a_i v_i\right) \bullet \left(\sum_{j=1}^4 b_j v_j\right) = \sum_{i=1}^4 \sum_{j=1}^4 a_i b_j (v_i \bullet v_j).$$

Then $F^4$, together with this operation, is a unital associative algebra known as a
quaternion algebra over F, in which $v_1$ is the identity element for multiplication.
In the special case of $F = \mathbb{R}$ and $a = b = -1$, we get the algebra of real quaternions,
which is denoted by H. The algebra of real quaternions was first defined by Hamil- 
ton in 1844 as a generalization of the field of complex numbers (and earlier studied 
by Gauss, who did not publish his results). It is a division algebra over R since every 
nonzero quaternion is a unit of H. These were subsequently generalized by Clifford 
and used in his study of non-Euclidean spaces. Lately, they have also been used in 
computer graphics and in signal analysis. If F is a field having characteristic p > 0, 
quaternion algebras over F are not even entire. However, they arise naturally in the 
theory of elliptic curves, and so are of great importance in cryptography. If p > 2, 
then no quaternion algebras over F are commutative. 
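
The multiplication table can be turned into a small program, which also gives a mechanical check of associativity on basis elements. This is a sketch under stated assumptions: the entries of the table are transcribed from the example, scalars are plain Python integers, indices 0 to 3 stand for $v_1, \ldots, v_4$, and the names `make_table` and `multiply` are illustrative.

```python
from itertools import product

def make_table(a, b):
    """table[i][j] = (coefficient, k) encodes v_i * v_j = coefficient * v_k (0-based indices)."""
    return [
        [(1, 0), (1, 1), (1, 2), (1, 3)],
        [(1, 1), (a, 0), (1, 3), (a, 2)],
        [(1, 2), (-1, 3), (b, 0), (-b, 1)],
        [(1, 3), (-a, 2), (b, 1), (-a * b, 0)],
    ]

def multiply(x, y, table):
    """Extend the table bilinearly: x and y are lists of four coefficients."""
    out = [0, 0, 0, 0]
    for i, j in product(range(4), repeat=2):
        coeff, k = table[i][j]
        out[k] += x[i] * y[j] * coeff
    return out

def unit(i):
    """The basis vector v_{i+1} as a coefficient list."""
    return [1 if j == i else 0 for j in range(4)]

# Associativity on all triples of basis elements, for one sample choice of a and b.
T = make_table(2, 3)
associative = all(
    multiply(multiply(unit(i), unit(j), T), unit(k), T)
    == multiply(unit(i), multiply(unit(j), unit(k), T), T)
    for i, j, k in product(range(4), repeat=3)
)

# Real quaternions (a = b = -1): x times its conjugate is the squared norm times v_1.
H = make_table(-1, -1)
x = [1, 2, 3, 4]
x_conj = [1, -2, -3, -4]
print(associative, multiply(x, x_conj, H))
```

The last line illustrates why every nonzero real quaternion is a unit: dividing the conjugate by the squared norm gives the inverse.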


Sir William Rowan Hamil- 
ton, a nineteenth-century 
Irish mathematician and 
physicist, helped create ma- 
trix theory in its modern formulation, together with Cayley and Sylvester. Hamilton was 
the first to use the terms “vector” and “scalar” in an algebraic context. His championship of 
quaternions as an alternative to vectors in physics was later taken up by Scottish mathemati- 
cian Peter Guthrie Tait. The nineteenth-century British mathematician William Kingdon 
Clifford was one of the first to argue that energy and matter were just different types of 
curvature of space. 


We now show that any vector space over a field F has a basis. Indeed, the following
two propositions show something somewhat stronger than that.


Proposition 5.6 If V is a vector space finitely generated over a field F then
every finite generating set of V over F contains a basis of V. 


Proof Let V be a vector space finitely generated over a field F and let D be a finite 
generating set for V over F. If D is minimal among all generating sets for V, then 
we know by Proposition 5.2 that it is a basis of V. If not, it properly contains other 
generating sets for V over F, one of which, say E, has the fewest elements. Then 
E cannot properly contain any other generating set for V over F, and so it must be 
a basis of V. 


Proposition 5.7 If V is a vector space finitely generated over a field F then 
every linearly-independent subset B of V is contained in a basis of V over F. 


Proof By assumption, there exists a finite generating set $\{v_1, \ldots, v_n\}$ for V over F.
Let B be a linearly-independent subset of V. If $v_i \in FB$ for each $1 \le i \le n$, then
FB = V, and B is itself a basis of V. Otherwise, let $h = \min\{i \mid v_i \notin FB\}$. By
Proposition 5.3, the set $D = B \cup \{v_h\}$ is linearly independent. If it is a generating set
for V, then it is a basis and we are done. If not, let $k = \min\{i \mid v_i \notin FD\}$, and replace
D by $B \cup \{v_h, v_k\}$. Continuing in this manner, we see that after finitely-many steps
we obtain a basis of V.
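
The proof is constructive, and its steps can be sketched as a short procedure. The code below assumes V = Q^n with exact rational arithmetic and tests membership in a span by comparing ranks; the function names `rank` and `extend_to_basis` are illustrative. Scanning the generators once in order gives the same result as repeatedly choosing the smallest index not yet in the span, since an index that has been passed over stays in the span as the set grows.

```python
from fractions import Fraction

def rank(vectors):
    """Rank over Q of a list of vectors, by Gaussian elimination with exact arithmetic."""
    m = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

def extend_to_basis(independent, generators):
    """Adjoin, in order, each generator that is not in the span of the current set."""
    current = list(independent)
    for v in generators:
        if rank(current + [v]) > rank(current):  # v lies outside the span of `current`
            current.append(v)
    return current

B = [(1, 1, 0)]
generators = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
basis = extend_to_basis(B, generators)
print(basis)
```

Starting from an empty list instead of B extracts a basis from the generating set itself, which is the situation of Proposition 5.6.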




The Italian mathematician Giuseppe Peano, best known for his ax- 
iomatization of the natural numbers, was the first to prove that every 
finitely-generated vector space has a basis at the end of the nineteenth 
century. He also gave the final form for the definition of a vector space, 
which we used above. 


We now want to extend this result to vector spaces which are not finitely gener- 
ated, and to do so we have to make use of an axiom of set theory known variously 
as the Hausdorff Maximum Principle or Zorn’s Lemma. To state this principle, we 
need another concept about partially-ordered sets. Let A be a set on which we have 
defined a partial order $\preceq$. A subset B of A is bounded if and only if there exists an
element $a_0 \in A$ satisfying $b \preceq a_0$ for all $b \in B$. Note that we do not require that $a_0$
belong to B. The Hausdorff maximum principle then says that if A is a partially- 
ordered set in which every chain subset is bounded, then A has a maximal element. 
Again, this is not really a “principle” or a “lemma”; it is an axiom of set theory which 
has been shown to be independent of the other (Zermelo—Fraenkel) axioms one usu- 
ally assumes. Indeed, it is logically equivalent to the Axiom of Choice, which we 
mentioned in Chap. 1 as being somewhat controversial among those mathematicians 
dealing with the foundations of mathematics. However, in this book, we will assume
that it holds. Given that assumption, we can now extend Proposition 5.7. 



Felix Hausdorff, one of the leading mathemati- 
cians of the early twentieth century and one of the 
founders of topology, died in a German concen- 
tration camp in 1942. Max Zorn, a German math- 
ematician who emigrated to the United States, 
made skillful use of the Hausdorff Maximum 
Principle in his research, turning it into an impor- 
tant mathematical tool. 


Proposition 5.8 If V is a vector space over a field F then every linearly- 
independent subset B of V is contained in a basis of V. 


Proof Let B be a linearly-independent subset of V and let P be the collection of 
all linearly-independent subsets of V which contain B, which is partially-ordered 
by inclusion, as usual. Then P is nonempty since $B \in P$. Let Q be a chain subset
of P. We want to prove that Q is bounded in P. That is to say, we want to find a 
linearly independent subset E of V which contains every element of Q. Indeed, let 
us take E to be the union of all of the elements of Q. To show that E is linearly 
independent, it suffices to show that every finite subset of E is linearly independent. 
Indeed, let $\{v_1, \ldots, v_n\}$ be a finite subset of E. Then for each $1 \le i \le n$, there exists
an element $D_i$ of Q containing $v_i$. Since Q is a chain, there exists an index h such
that $D_i \subseteq D_h$ for all $1 \le i \le n$ and so $v_i \in D_h$ for all $1 \le i \le n$. Therefore, this set is
a subset of a linearly-independent set and so is linearly independent. Thus we have 
shown that every chain subset of P is bounded and so, by the Hausdorff maximum 
principle, the set P has a maximal element. In other words, there exists a maximal 
linearly-independent subset of V containing B, and this, as we know, is a basis of 
V over F. 


Taking the special case of B = Ø in Proposition 5.8, we see that every vector 
space has a basis. In the above proof we used the Axiom of Choice to prove this 
statement. In fact, one can show something considerably stronger: in the presence 
of the other generally-accepted axioms of set theory, the Axiom of Choice is equiv- 
alent, in the sense of formal logic, to the statement that every vector space over any 
field has a basis. 


The above result is due to the contemporary American mathematician, 
Andreas Blass. 


Example Consider the field R as a vector space over its subfield Q. A basis for 
this space is known as a Hamel basis. By Proposition 5.8, we know that Hamel 
bases exist, but nobody has been able to come up with a method of specifically 
constructing one. The subset C of R consisting of all real numbers which can be 
represented in the form $\sum_{i > 0} u_i 3^{-i}$, where each $u_i$ is either 0 or 2, is called the
Cantor set, and it can be shown to be “sparse” (in a technical sense of the word we 
won’t go into here) in the unit interval [0, 1] in R. It is possible to show that there is 
a Hamel basis of R contained in C. 

The existence of Hamel bases leads to some very interesting results, as the following
shows. Indeed, let H be a Hamel basis of R. If $r \in \mathbb{R}$ then we can write
$r = \sum_{a \in H} q_a(r) a$, where $q_a(r) \in \mathbb{Q}$ and there are only finitely-many elements
$a \in H$ for which $q_a(r) \neq 0$. Since such a representation is unique, we see that

$$q_a(b) = \begin{cases} 1 & \text{if } a = b, \\ 0 & \text{otherwise} \end{cases}$$

for $a, b \in H$. Moreover, if $r, s \in \mathbb{R}$ and $a \in H$ then $q_a(r + s) = q_a(r) + q_a(s)$ so, if
$a \neq b$ are elements of H then for any $r \in \mathbb{R}$ we have $q_a(r + b) = q_a(r) + q_a(b) = q_a(r)$.
Thus we see that the function $q_a \in \mathbb{R}^{\mathbb{R}}$ is periodic, with period b for any
$b \in H \setminus \{a\}$, and its image is contained in Q. Moreover, if we pick two distinct
elements c and d of H, we see that for each $r \in \mathbb{R}$, we have $r = f(r) + g(r)$, where
$f, g \in \mathbb{R}^{\mathbb{R}}$ are defined by $f : r \mapsto q_c(r)c$ and $g : r \mapsto \sum_{a \in H \setminus \{c\}} q_a(r)a$. By our
previous comments, f is periodic with period d and g is periodic with period c.
We conclude that the identity function in $\mathbb{R}^{\mathbb{R}}$ is the sum of two periodic functions.
A somewhat more sophisticated argument along the same lines shows that any polynomial
function in $\mathbb{R}^{\mathbb{R}}$ of degree n is the sum of n + 1 periodic functions. Of course,
since we cannot specify H, there is no way of finding these periodic functions ex- 
plicitly. 




The twentieth-century German mathematician Georg Hamel was a 
student of Hilbert who worked primarily in function theory. In his later 
years, he became notorious for his pro-Nazi views and activities. 


We have seen that a vector space over a field can have many bases. We want
to show next that if the vector space is finitely generated, then all of these bases 
are finite and have the same number of elements. First, however, we must prove a 
preliminary result. 


Proposition 5.9 Let V be a vector space over a field F which is generated by 
a finite set $B = \{v_1, \ldots, v_n\}$ and let D be a linearly independent set of vectors
in V. Then the number of elements in D is at most n. 


Proof Suppose that D has a subset $E = \{w_1, \ldots, w_{n+1}\}$ having more than n elements.
Since this set must also be linearly independent, we know that none of the
$w_i$ equals $0_V$. For each $1 \le k \le n$, set $D_k = \{w_1, \ldots, w_k, v_{k+1}, \ldots, v_n\}$.

Since B is a generating set for V, we can find scalars $a_1, \ldots, a_n$, not all equal to 0,
such that $w_1 = \sum_{i=1}^n a_i v_i$. In order to simplify our notation, we will renumber the
elements of B if necessary so that $a_1 \neq 0$. Then $v_1 = a_1^{-1} w_1 - \sum_{i=2}^n a_1^{-1} a_i v_i$ and so
$B \subseteq FD_1$. But $D_1 \subseteq V = FB$ and so $V = FD_1$ by Proposition 3.6. Now assume
that $1 \le k < n$ and that we have already shown that $V = FD_k$. Then there exist
scalars $b_1, \ldots, b_n$, not all equal to 0, such that $w_{k+1} = \sum_{i=1}^k b_i w_i + \sum_{i=k+1}^n b_i v_i$.
If the scalars $b_{k+1}, \ldots, b_n$ are all equal to 0, then we have shown that D is linearly
dependent, which is not the case. Therefore, at least one of them is nonzero and, by
renumbering if necessary, we can assume that $b_{k+1} \neq 0$. Thus
$v_{k+1} = b_{k+1}^{-1} w_{k+1} - \sum_{i=1}^k b_{k+1}^{-1} b_i w_i - \sum_{i=k+2}^n b_{k+1}^{-1} b_i v_i$
and so, using the above reasoning, we get $V = FD_{k+1}$. Continuing in this manner,
we see that after n steps we obtain $V = FD_n = F\{w_1, \ldots, w_n\}$.
But then $w_{n+1} \in F\{w_1, \ldots, w_n\}$ and so E is linearly dependent,
contrary to our assumption. This proves that D can have at most n elements.


Proposition 5.10 Let V be a vector space finitely generated over a field F. 
Then any two bases of V have the same number of elements. 


Proof By hypothesis, there exists a finite generating set for V over F having, say, 
n elements. If B is a basis of V then, by Proposition 5.9, we know that B has at 
most n elements and so, in particular, is finite. Suppose B and B’ are two bases 
for V having h and k elements, respectively. Since B is linearly independent and 
B’ is a generating set, we know that $h \le k$. But, on the other hand, B’ is linearly
independent and B is a generating set, so $k \le h$. Thus h = k.


We should remark at this point that the assertion for linearly-dependent sets cor- 
responding to Proposition 5.10 is not true. That is to say, a finite linearly-dependent 
set of vectors may have two minimal linearly-dependent subsets with different num- 
bers of elements. Indeed, there is no efficient algorithm to find such subsets of a 
given linearly-dependent set. Minimal linearly-dependent sets of vectors are often 
called circuits because of applications to graph theory. We should also note that
Proposition 5.10 is a special case of a more general theorem: If V is a vector space
(not necessarily finitely generated) over a field F then there exists a bijective func- 
tion between any two bases of V. The proof of this result makes use of techniques 
from advanced set theory, such as transfinite induction. 
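
The remark about minimal linearly-dependent subsets of different sizes can be illustrated on a small set of vectors. The sketch below assumes vectors in Q^2 and enumerates subsets by brute force, which takes exponential time and is only meant for tiny inputs; the names `rank` and `circuits` are illustrative.

```python
from fractions import Fraction
from itertools import combinations

def rank(vectors):
    """Rank over Q of a list of vectors, by Gaussian elimination with exact arithmetic."""
    m = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

def circuits(vectors):
    """All minimal linearly-dependent subsets, returned as tuples of indices.

    Subsets are visited by increasing size, so a dependent subset that contains
    no previously found circuit is itself minimal.
    """
    found = []
    for size in range(1, len(vectors) + 1):
        for idx in combinations(range(len(vectors)), size):
            subset = [vectors[i] for i in idx]
            dependent = rank(subset) < len(subset)
            if dependent and not any(set(c) <= set(idx) for c in found):
                found.append(idx)
    return found

vectors = [(1, 0), (2, 0), (0, 1), (1, 1)]
result = circuits(vectors)
print(result)
```

Here the first two vectors form a circuit with two elements, while other circuits in the same set have three.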

If V is a vector space finitely generated over a field F then V is finite dimen- 
sional and the number of elements in a basis of V is called the dimension of V 
over F. If V is not finite dimensional, it is infinite dimensional. (In choosing this 
latter terminology, we are deliberately skipping over the subject of various transfi- 
nite dimensions, since the reader is not assumed to be familiar with the arithmetic 
of transfinite cardinals. In certain mathematical contexts, distinction between infi- 
nite dimensions—for example the distinction between spaces of countably-infinite 
and uncountably-infinite dimension—can be very significant. We will not, however, 
need it in this book.) We denote the dimension of V over F by dim(V), or by 
$\dim_F(V)$ when it is important to emphasize the field of scalars.



The notion of dimension was implicit in the work of Peano, but was 
redefined and studied in a comprehensive manner by the twentieth- 
century German mathematician Hermann Weyl. 


Notice that the proof of Proposition 5.9, which is in turn critical in proving Propo- 
sition 5.10, uses the fact that every nonzero element of F has a multiplicative in- 
verse, and this cannot be avoided. If we try to weaken the notion of a vector space 
by allowing scalars to be, say, only integers, it may happen that such a space would 
have two bases of different sizes and so we could no longer define the notion of 
dimension in an obvious manner. We did not use, in an unavoidable manner, the 
commutativity of scalar multiplication and so we could weaken our notion of a vec- 
tor space to allow scalars which do not commute among themselves, such as scalars 
coming from H. However, the generality thus gained does not seem to outweigh the 
bother it causes, and so we will refrain from doing so. Thus, for us, the fact that 
scalars always come from a field is critical in the development of our theory. 


Example If F is a field then $\dim(F^n) = n$ for every positive integer n, since the
canonical basis of $F^n$ has n elements. Similarly, if k and n are positive integers then
$\dim_F(M_{k \times n}(F)) = kn$, since the canonical basis of $M_{k \times n}(F)$ has kn elements.
The dimension of the space F[X] is infinite since the canonical basis of F[X] has 
infinitely-many elements. 


Example If F is a field and n is a positive integer, then the set W of all polynomials 
in F[X] having degree at most n is a subspace of F[X] having dimension n + 1, 
since $\{1, X, \ldots, X^n\}$ is a basis of W having n + 1 elements.




Example Let V be a vector space over R. Then $Y = V^2$ is a vector space over R,
but it also has the structure of a vector space over C with the same addition and
with scalar multiplication given by

$$(a + bi) \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} a v_1 - b v_2 \\ b v_1 + a v_2 \end{bmatrix}.$$

This space is called the complexification of V. If B is a basis for V over R then it is
easy to check that

$$\left\{ \begin{bmatrix} v \\ 0_V \end{bmatrix} \,\middle|\, v \in B \right\}$$

is a basis for Y over C. Thus, in particular, if V is finitely
generated over R then Y is finitely generated over C and $\dim_{\mathbb{R}}(V) = \dim_{\mathbb{C}}(Y)$.
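
The scalar multiplication on the complexification can be tried out directly. The sketch below assumes V = R^n with vectors stored as tuples of floats and complex scalars given as Python `complex` values; the function name `cmul` is illustrative. The sample values are integer-valued so that the floating-point comparisons are exact.

```python
def cmul(c, y):
    """(a + bi) * (v1, v2) = (a*v1 - b*v2, b*v1 + a*v2) on Y = V^2, with V = R^n."""
    a, b = c.real, c.imag
    v1, v2 = y
    top = tuple(a * p - b * q for p, q in zip(v1, v2))
    bottom = tuple(b * p + a * q for p, q in zip(v1, v2))
    return (top, bottom)

y = ((1.0, 2.0), (3.0, -1.0))
c1, c2 = complex(2, 3), complex(1, -1)

# Compatibility of the action with multiplication in C: (c1 c2) y = c1 (c2 y).
compatible = cmul(c1 * c2, y) == cmul(c1, cmul(c2, y))

# Multiplying (v, 0) by i moves v into the second component.
moved = cmul(1j, ((1.0, 2.0), (0.0, 0.0)))
print(compatible, moved)
```

The second computation shows why the vectors of the form (v, 0) suffice as a basis over C: an arbitrary pair (v1, v2) equals (v1, 0) + i(v2, 0).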




Complexification of real vector spaces was first used extensively by 
the twentieth-century American mathematician Angus Taylor. 


Example The dimension of R over itself is 1. Since $\{1, i\}$ is a basis of C as a vector
space over R, we see that $\dim_{\mathbb{R}}(\mathbb{C}) = 2$ and so there cannot be a proper subfield F
of C properly containing R. Indeed, if there were such a field, its dimension over R
would have to be greater than 1 and less than 2 (else it would be equal to C), which
is impossible. Clearly, $\dim_{\mathbb{R}}(\mathbb{H}) = 4$. It turns out that the only possible dimensions
of division R-algebras are 1, 2, 4, and 8. The dimension 8 case is realized by a (non- 
associative) Cayley algebra over R, as defined in Chap. 15. There are no associative 
division algebras of dimension 8 over R. 


The twentieth-century German mathematician Heinz Hopf used algebraic topology
to prove that the only possible dimensions of division
R-algebras were powers of 2, and the final result was obtained by the twentieth-century
American mathematician Raoul Bott and contemporary American mathematician John
Milnor, again using non-algebraic tools.


Example Let F be a field, let $(K, \bullet)$ be an associative unital F-algebra, and let
$v \in K$. If $p(X) = \sum_{i=0}^n a_i X^i \in F[X]$, then $p(v) = \sum_{i=0}^n a_i v^i$ is an element of K
and the set of all elements of K of this form is an F-subalgebra of K, which is in fact
commutative, even though K itself may not be. We will denote this algebra by F[v].
If the dimension of F[v], considered as a vector space over F, is finite, we know that
there must exist a polynomial $p(X) \in F[X]$ of positive degree satisfying $p(v) = 0_K$.
In that case, we say that v is algebraic over F. Otherwise, if the dimension of F[v]
is infinite, we say v is transcendental over F. Thus, for example, the real numbers
$\pi$ and e (the base of the natural logarithms) are transcendental over Q. If F is a
subfield of a field K then the set L of all elements of K which are algebraic over F 
is a subfield of K. Moreover, if K is algebraically closed, so is L, and in fact L is 
the smallest algebraically-closed subfield of K containing F. In particular, we can 
consider the field of all complex numbers algebraic over Q. This is a proper subfield 
of C, known as the field of algebraic numbers. 
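
To make the notion of an algebraic element concrete, here is a minimal sketch in Python. It assumes a specific algebra and element that are not in the text: $K = M_{2\times 2}(\mathbb{Q})$ and the matrix `v` below. The helper names `rank`, `matmul`, and `flatten` are illustrative only. The sketch checks that the powers $1, v, v^2$ span a 2-dimensional space and that $p(X) = X^2 + 1$ satisfies $p(v) = 0_K$.

```python
from fractions import Fraction

def rank(vectors):
    """Rank over Q of a nonempty list of equal-length vectors (exact elimination)."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def flatten(A):
    """View a matrix as a vector by listing its entries row by row."""
    return [x for row in A for x in row]

identity = [[1, 0], [0, 1]]
v = [[0, 1], [-1, 0]]            # assumed example element of K
v_squared = matmul(v, v)

# Dimension of the span of 1, v, v^2 inside K, viewed as Q^4.
dim_span = rank([flatten(identity), flatten(v), flatten(v_squared)])

# p(v) for p(X) = X^2 + 1.
p_of_v = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(v_squared, identity)]
```

Since $v^2 = -1$ here, every higher power of $v$ already lies in the span of $1$ and $v$, so for this particular $v$ the algebra $F[v]$ has dimension 2 and $v$ is algebraic over $\mathbb{Q}$.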


With kind permission of the Archives of the Mathematisches Forschungsinstitut Oberwolfach.

The transcendence of $\pi$ was proven by German mathematician Ferdinand von
Lindemann in 1882. The transcendence of $e$ was proven by French mathematician
Charles Hermite in 1873. As we shall see later, Hermite made many important
contributions to linear algebra.


From the definition of dimension we see that if V is a vector space of finite 
dimension n over a field F then: 
(1) Every subset of V having more than n elements must be linearly dependent; 
(2) There exists a linearly-independent subset B of V having precisely n elements; 
(3) If B is as in (2) then B is also a generating set of V over F. 


Proposition 5.11 Let V be a vector space finitely generated over a field F 
and let W be a subspace of V . Then: 

(1) W is finitely generated over F; 

(2) Every basis of W can be extended to a basis of V; 

(3) $\dim(W) \le \dim(V)$, with equality when and only when $W = V$.


Proof Let n = dim(V). 

(1) If W is not finitely generated, then, as we remarked after Proposition 5.3, 
W has a linearly-independent subset B having n + 1 elements. But B is also a 
subset of V, contradicting the assumption that dim(V) = n. 

(2) Let $B$ be a basis of $W$. Then $B$ is a linearly-independent set of elements of
$V$ and so, by Proposition 5.7, can be extended to a basis of $V$.

(3) By (2), we see that the number of elements of a basis of $W$ can be no greater
than the number of elements of a basis of $V$, and so $\dim(W) \le \dim(V)$. Moreover, if
we have equality then any basis $B$ of $W$ is also a basis of $V$, and so $W = FB = V$.
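
Part (2) can be illustrated computationally. The following is a minimal sketch, assuming $F = \mathbb{Q}$ and $V = \mathbb{Q}^3$, with a subspace $W$ chosen only for illustration. The function names `rank` and `extend_to_basis` are hypothetical. It extends a basis of $W$ greedily, adding a standard basis vector whenever that vector is not already in the span.

```python
from fractions import Fraction

def rank(vectors):
    """Rank over Q of a nonempty list of equal-length vectors (exact elimination)."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def extend_to_basis(independent, n):
    """Extend a linearly independent list in Q^n to a basis of Q^n."""
    basis = list(independent)
    for k in range(n):
        e_k = [1 if j == k else 0 for j in range(n)]
        # Keep e_k only if it enlarges the span.
        if rank(basis + [e_k]) == len(basis) + 1:
            basis.append(e_k)
    return basis

W_basis = [[1, 0, 2], [1, 2, 2]]       # assumed basis of a subspace W of Q^3
V_basis = extend_to_basis(W_basis, 3)  # starts with W_basis, ends as a basis of Q^3
```

The greedy strategy works because the standard basis generates $\mathbb{Q}^n$: once no standard basis vector enlarges the span, the span is all of $\mathbb{Q}^n$.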


We now want to extend the notion of linear independence. Let $U$ and $W$ be
subspaces of a vector space $V$ over a field $F$. Any vector $v \in U + W$ can be written in
the form $u + w$, where $u \in U$ and $w \in W$, but there is no reason for this
representation to be unique. It will be unique, however, if $U$ and $W$ are disjoint. Indeed, if
this condition holds and if $u, u' \in U$ and $w, w' \in W$ satisfy $u + w = u' + w'$, then
$u - u' = w' - w \in U \cap W$ and so $u - u' = 0_V = w' - w$, which in turn implies that
$u = u'$ and $w = w'$. To emphasize the importance of this situation, we will introduce
new notation: if $U$ and $W$ are disjoint subspaces of a vector space $V$ over a field $F$,
we will write $U \oplus W$ instead of $U + W$. The subspace $U \oplus W$ is called the direct
sum of $U$ and $W$. We note that, by this definition, $U \oplus \{0_V\} = U$ for every subspace
$U$ of $V$.


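Example It is easy to see that $\mathbb{R}^2 = \mathbb{R}\begin{bmatrix} 1 \\ 0 \end{bmatrix} \oplus \mathbb{R}\begin{bmatrix} 0 \\ 1 \end{bmatrix}$.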


Of course, we would like to extend the notion of direct sum to cover more than
two subspaces. In general, if $V$ is a vector space over a field $F$, then a collection
$\{W_h \mid h \in \Omega\}$ of subspaces of $V$ is independent if and only if it satisfies the following
condition: If $\Lambda$ is a finite subset of $\Omega$ and if we choose elements $w_h \in W_h$ for all
$h \in \Lambda$, then $\sum_{h \in \Lambda} w_h = 0_V$ when and only when $w_h = 0_V$ for each $h \in \Lambda$. Thus
we see that an infinite collection of subspaces is independent if and only if every
finite nonempty subcollection is independent. Clearly, a subset $D$ of a vector space
$V$ over a field $F$ is linearly independent if and only if the collection of subspaces
$\{Fv \mid v \in D\}$ is independent.


Proposition 5.12 Let $V$ be a vector space over a field $F$ and let $W_1, \ldots, W_n$
be distinct subspaces of $V$. Then the following conditions are equivalent:

(1) $\{W_1, \ldots, W_n\}$ is independent;

(2) Every vector $w \in \sum_{i=1}^{n} W_i$ can be written as $w_1 + \cdots + w_n$, with $w_i \in W_i$
for each $1 \le i \le n$, in exactly one way;

(3) $W_h$ and $\sum_{i \ne h} W_i$ are disjoint, for each $1 \le h \le n$.


Proof (1) $\Rightarrow$ (2): Let $w \in \sum_{i=1}^{n} W_i$ and assume that we can write $w = w_1 + \cdots + w_n
= y_1 + \cdots + y_n$, where $w_i, y_i \in W_i$ for each $1 \le i \le n$. Then $\sum_{i=1}^{n} (w_i - y_i) = 0_V$
and so, by (1), it follows that $w_i - y_i = 0_V$ for each $1 \le i \le n$, proving (2).

(2) $\Rightarrow$ (3): Assume that $0_V \ne w_h \in W_h \cap \sum_{i \ne h} W_i$. Then for each $i \ne h$ there
exists an element $w_i \in W_i$ satisfying $w_h = \sum_{i \ne h} w_i$, contradicting (2).

(3) $\Rightarrow$ (1): Suppose we can write $w_1 + \cdots + w_n = 0_V$, where $w_i \in W_i$ for each
$1 \le i \le n$, and where $w_h \ne 0_V$ for some $h$. Then $w_h = -\sum_{i \ne h} w_i \in W_h \cap \sum_{i \ne h} W_i$,
and this contradicts (3). Thus (1) must hold.


If $V$ is a vector space over a field $F$ and if $\{W_i \mid i \in \Omega\}$ is an independent
collection of subspaces of $V$, we write $\bigoplus_{i \in \Omega} W_i$ instead of $\sum_{i \in \Omega} W_i$. If $\Omega = \{1, \ldots, n\}$,
we will also write this sum as $W_1 \oplus \cdots \oplus W_n$. If $V = \bigoplus_{i \in \Omega} W_i$, then we say that $V$
has a direct-sum decomposition relative to the subspaces $W_i$.


Example If $B$ is a basis of a vector space $V$ over a field $F$ then $V = \bigoplus_{v \in B} Fv$.


The importance of direct-sum decompositions is illustrated by the following
result.


Proposition 5.13 Let $V$ be a vector space over a field $F$, let $\{W_i \mid i \in \Omega\}$ be
a pairwise disjoint collection of subspaces of $V$ and, for each $i \in \Omega$, let $B_i$ be
a basis of $W_i$. Then $V = \bigoplus_{i \in \Omega} W_i$ if and only if $B = \bigcup_{i \in \Omega} B_i$ is a basis of $V$.


Proof Assume $V = \bigoplus_{i \in \Omega} W_i$ and let $v \in V$. Then there exists a finite subset $\Lambda$
of $\Omega$ such that $v \in \bigoplus_{i \in \Lambda} W_i$, and so for each $i \in \Lambda$ there is an element $w_i \in W_i$
satisfying $v = \sum_{i \in \Lambda} w_i$. Moreover, each $w_i$ is a linear combination of elements
of $B_i$. Thus $v$ is a linear combination of elements of $B$, and so $B$ is a generating
set for $V$. We are left to show that $B$ is linearly independent. If this is not the case,
then there exist an element $h$ of $\Omega$, vectors $v_1, \ldots, v_t$ in $B_h$, and scalars $a_1, \ldots, a_t$
in $F$, not all of which are equal to 0, such that $\sum_{j=1}^{t} a_j v_j + u = 0_V$, where $u$ is a
linear combination of elements of $\bigcup_{i \ne h} B_i$. But then $\sum_{j=1}^{t} a_j v_j$, which is nonzero
because $B_h$ is linearly independent, belongs to $W_h \cap \sum_{i \ne h} W_i$,
contradicting our initial assumption. Thus $B = \bigcup_{i \in \Omega} B_i$ is a basis of $V$.

Conversely, if $B = \bigcup_{i \in \Omega} B_i$ is a basis of $V$, it then follows that every element of $V$ can be
written in a unique way as $\sum_{i \in \Lambda} w_i$, where $\Lambda$ is some finite subset of $\Omega$ and each
$w_i$ belongs to $W_i$, which suffices to prove that $V = \bigoplus_{i \in \Omega} W_i$.


Let W be a subspace of a vector space V over a field F. A subspace Y of V is a
complement of W in V if and only if V = W ⊕ Y. We immediately note that if Y is
a complement of W in V then W is a complement of Y in V. In general, a subspace 
of a vector space can have many complements. 


Example Each of the following subspaces of R? is a complement of each of the 
others in R?: 


Proposition 5.14 Every subspace W of a vector space V over a field F has 
at least one complement in V. 


Proof If $W$ is improper, then $\{0_V\}$ is a complement of $W$ in $V$. Similarly, $V$ is a
complement of $\{0_V\}$ in $V$. Otherwise, let $B$ be a basis of $W$. By Proposition 5.8, we
know that there exists a linearly-independent subset $D$ of $V$ such that $B \cup D$ is a
basis of $V$. Then $FD$ is a complement of $W$ in $V$.


Example Let $F$ be a field of characteristic other than 2, let $n$ be a positive integer,
and let $V$ be a vector space over $F$. Let $W = M_{n \times n}(V)$, which is also a vector
space over $F$. Let $W_1$ be the set of all those matrices $A = [v_{ij}]$ in $W$ satisfying
$v_{ij} = v_{ji}$ for all $1 \le i, j \le n$, and let $W_2$ be the set of all those matrices $A = [v_{ij}]$
in $W$ satisfying $v_{ij} = -v_{ji}$ for all $1 \le i, j \le n$. These two subspaces are disjoint. If
$A = [v_{ij}]$ is an arbitrary matrix in $W$, then we can write $A = B + C$, where $B = [y_{ij}]$
is the matrix defined by $y_{ij} = \frac{1}{2}(v_{ij} + v_{ji})$ for all $1 \le i, j \le n$, and $C = [z_{ij}]$ is the
matrix defined by $z_{ij} = \frac{1}{2}(v_{ij} - v_{ji})$ for all $1 \le i, j \le n$. Note that $B \in W_1$ and
$C \in W_2$. Thus $W = W_1 \oplus W_2$.
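
The decomposition $A = B + C$ of this example can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the entries are taken to be rational numbers (that is, $V = F = \mathbb{Q}$) rather than vectors of a general space, and the function names `transpose` and `decompose` are hypothetical.

```python
from fractions import Fraction

def transpose(A):
    """Transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*A)]

def decompose(A):
    """Return (B, C): the symmetric part and the skew-symmetric part of A."""
    n = len(A)
    half = Fraction(1, 2)   # needs characteristic other than 2
    B = [[half * (A[i][j] + A[j][i]) for j in range(n)] for i in range(n)]
    C = [[half * (A[i][j] - A[j][i]) for j in range(n)] for i in range(n)]
    return B, C

A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]   # assumed sample matrix
B, C = decompose(A)
total = [[b + c for b, c in zip(rb, rc)] for rb, rc in zip(B, C)]
```

The factor $\frac{1}{2}$ is where the assumption on the characteristic enters: in characteristic 2 the element 2 has no inverse and the construction is unavailable.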


Example A function $f \in \mathbb{R}^{\mathbb{R}}$ is even if and only if $f(a) = f(-a)$ for all $a \in \mathbb{R}$; it
is odd if and only if $f(a) = -f(-a)$ for all $a \in \mathbb{R}$. The set $W$ of all even functions
is clearly a subspace of $\mathbb{R}^{\mathbb{R}}$, as is the set $Y$ of all odd functions, and these two
subspaces are disjoint. Moreover, if $f \in \mathbb{R}^{\mathbb{R}}$ then $f = f_1 + f_2$, where the function
$f_1 : x \mapsto \frac{1}{2}[f(x) + f(-x)]$ is in $W$ and the function $f_2 : x \mapsto \frac{1}{2}[f(x) - f(-x)]$ is
in $Y$. Thus $Y$ is a complement of $W$ in $\mathbb{R}^{\mathbb{R}}$.
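
As a quick numerical illustration of this decomposition, the sketch below splits the exponential function into its even and odd parts, which are the hyperbolic cosine and sine. The choice of `math.exp` and the helper names `even_part` and `odd_part` are assumptions made for the example, and floating-point arithmetic is used, so equalities hold only up to rounding.

```python
import math

def even_part(f):
    """The even function x -> (f(x) + f(-x)) / 2."""
    return lambda x: 0.5 * (f(x) + f(-x))

def odd_part(f):
    """The odd function x -> (f(x) - f(-x)) / 2."""
    return lambda x: 0.5 * (f(x) - f(-x))

f = math.exp            # assumed sample function
f1 = even_part(f)       # agrees with cosh up to rounding
f2 = odd_part(f)        # agrees with sinh up to rounding

samples = [-2.0, -0.5, 0.0, 1.0, 3.0]
recombined = [f1(x) + f2(x) for x in samples]
```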


Proposition 5.15 Let F be a field which is not finite and let V be a vector 
space over F having dimension at least 2. Then every proper nontrivial sub- 
space W of V has infinitely-many complements in V. 


Proof By Proposition 5.14, we know that $W$ has at least one complement $U$ in $V$.
Choose a basis $B$ for $U$. If $0_V \ne w \in W$, then by Proposition 3.2(9) and the fact
that $F$ is infinite, we know that $Fw$ is an infinite subset of $W$. Thus we know that
the set $W$ is infinite. For each $w \in W$, let $Y_w = F\{u + w \mid u \in B\}$. We claim that
each of these spaces is a complement of $W$ in $V$. Indeed, assume that $v \in W \cap Y_w$.
Then there exist elements $u_1, \ldots, u_n$ of $B$ and scalars $c_1, \ldots, c_n$ in $F$ satisfying
$v = \sum_{i=1}^{n} c_i(u_i + w)$. But then $\sum_{i=1}^{n} c_i u_i = v - (\sum_{i=1}^{n} c_i)w \in W \cap U = \{0_V\}$
and since the set $\{u_1, \ldots, u_n\}$ is linearly independent, we see that $c_i = 0$ for all $i$.
This shows that $v = 0_V$, and we have thus shown that $W$ and $Y_w$ are disjoint. If
$v$ is an arbitrary element of $V$, let us write $v = x + \sum_{i=1}^{n} c_i u_i$, where $x \in W$,
the vectors $u_1, \ldots, u_n$ belong to $B$, and the scalars $c_1, \ldots, c_n$ belong to $F$. Then
$v = [x - (\sum_{i=1}^{n} c_i)w] + \sum_{i=1}^{n} c_i(u_i + w) \in W + Y_w$ and thus we have shown that
$V = W + Y_w$ and so $Y_w$ is a complement of $W$ in $V$.

We are left to show that all of these complements are indeed different from each
other. Indeed, assume that $w \ne x$ are elements of $W$ satisfying $Y_w = Y_x$. If $u \in B$
then there exist elements $u_1, \ldots, u_n$ of $B$ and scalars $c_1, \ldots, c_n$ such that $u + w =
\sum_{i=1}^{n} c_i(u_i + x)$. From this it follows that $u - \sum_{i=1}^{n} c_i u_i = (\sum_{i=1}^{n} c_i)x - w$ and
this belongs to $W \cap U = \{0_V\}$. But $B$ is a linearly-independent set and so $u$ has to
be equal to one of the $u_h$ for some $1 \le h \le n$, and we must have $c_i = 0$ for $i \ne h$ and
$c_h = 1$. Hence $x - w = 0_V$, namely $x = w$. This is a contradiction, and so the $Y_w$
must all be distinct.


Proposition 5.16 (Grassmann’s Theorem) Let $V$ be a vector space over a
field $F$ and let $W$ and $Y$ be subspaces of $V$ satisfying the condition that $W + Y$
has finite dimension. Then $\dim(W + Y) = \dim(W) + \dim(Y) - \dim(W \cap Y)$.


Proof Let $U_0 = W \cap Y$, which is a subspace both of $W$ and of $Y$. In particular, $U_0$
has a complement $U_1$ in $W$ and a complement $U_2$ in $Y$. Then $W + Y = U_0 + U_1 +
U_2$. We claim that in fact $W + Y = U_0 \oplus U_1 \oplus U_2$. Indeed, assume that $u_0 + u_1 +
u_2 = 0_V$, where $u_j \in U_j$ for $j = 0, 1, 2$. Then $u_1 = -u_2 - u_0 \in W \cap Y = U_0$. But
$U_0$ and $U_1$ are disjoint and so $u_1 = 0_V$. Therefore, $u_0 = -u_2 \in U_0 \cap U_2 = \{0_V\}$.
Therefore, $u_0 = 0_V$ and $u_2 = 0_V$ as well. Thus we see that the set $\{U_0, U_1, U_2\}$ is
independent. Therefore, from the definition of the complement, we have

$$\dim(W + Y) = \dim(U_0) + \dim(U_1) + \dim(U_2) = \dim(W) + \dim(U_2)$$

and this equals $\dim(W) + \dim(Y) - \dim(W \cap Y)$ since $Y = U_2 \oplus (W \cap Y)$.


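Example Consider the subspaces

$$W_1 = \mathbb{R}\left\{ \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix}, \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix} \right\} \quad \text{and} \quad W_2 = \mathbb{R}\left\{ \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} \right\}$$

of $\mathbb{R}^3$. Each one of these subspaces has dimension 2, and so we see that $2 \le
\dim(W_1 + W_2) \le 3$. By Proposition 5.16, we see that, as a result of this, we have
$1 \le \dim(W_1 \cap W_2) \le 2$. In order to ascertain the exact dimension of $W_1 \cap W_2$,
we must find a basis for it. If $v \in W_1 \cap W_2$ then there exist scalars $a, b, c, d$ satisfying

$$a \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix} + b \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix} = c \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + d \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix},$$

and so $a + b = c$, $2b = c + d$, and $2a + 2b = d$, from which we conclude that
$b = -3a$, $c = -2a$, and $d = -4a$. Thus $v$ has to be of the form

$$(-2a) \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + (-4a) \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} = a \begin{bmatrix} -2 \\ -6 \\ -4 \end{bmatrix},$$

which shows that $W_1 \cap W_2 = \mathbb{R} \begin{bmatrix} -2 \\ -6 \\ -4 \end{bmatrix}$, and so it has dimension 1.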
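
The dimension count in this example can be double-checked with exact rational arithmetic. The sketch below is only a verification aid: the helper `rank` is a hypothetical name for a plain Gaussian elimination, and the computation is done over $\mathbb{Q}$, which is enough here because all the entries are integers. It obtains $\dim(W_1 \cap W_2)$ from Grassmann's formula and confirms that the vector with entries $-2, -6, -4$ lies in both subspaces.

```python
from fractions import Fraction

def rank(vectors):
    """Rank over Q of a nonempty list of equal-length vectors (exact elimination)."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

W1 = [[1, 0, 2], [1, 2, 2]]
W2 = [[1, 1, 0], [0, 1, 1]]

dim_sum = rank(W1 + W2)                           # dim(W1 + W2)
dim_intersection = rank(W1) + rank(W2) - dim_sum  # Grassmann's formula

v = [-2, -6, -4]
# v lies in a span exactly when adjoining it does not raise the rank.
v_in_W1 = rank(W1 + [v]) == rank(W1)
v_in_W2 = rank(W2 + [v]) == rank(W2)
```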


Very often, we can reduce our computations by passing to complements. A good 
example of this is given by the following proposition.


Proposition 5.17 Let $V$ be a vector space over a field $F$ and let $W$ be a
subspace of $V$ having a complement $Y$ in $V$. Let $\{v_1, \ldots, v_n\}$ be a subset
of $V$ and, for each $1 \le i \le n$, let $v_i = w_i + y_i$, where $w_i \in W$ and $y_i \in Y$.
If the vectors $w_1, \ldots, w_n$ are distinct and the set $\{w_1, \ldots, w_n\}$ is linearly
independent, then so is the set $\{v_1, \ldots, v_n\}$.


Proof Assume that there exist scalars $a_1, \ldots, a_n$ satisfying $\sum_{i=1}^{n} a_i v_i = 0_V$. Then
$\sum_{i=1}^{n} a_i w_i + \sum_{i=1}^{n} a_i y_i = 0_V$ and so, since $W$ and $Y$ are disjoint,
$\sum_{i=1}^{n} a_i w_i = 0_V = \sum_{i=1}^{n} a_i y_i$. Since the vectors $w_1, \ldots, w_n$ are distinct and
$\{w_1, \ldots, w_n\}$ is linearly independent, we must have
$a_1 = \cdots = a_n = 0$, and so $\{v_1, \ldots, v_n\}$ is linearly independent as well.


Exercises 


Exercise 158 

Let $v_1$, $v_2$, and $v_3$ be distinct elements of a vector space $V$ over a field
$F$ and let $c_1, c_2, c_3 \in F$. Under what conditions is the subset $\{c_2 v_3 - c_3 v_2,
c_1 v_2 - c_2 v_1, c_3 v_1 - c_1 v_3\}$ of $V$ linearly dependent?


Exercise 159 
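For which values of the real number $t$ is the subset

$$\left\{ \begin{bmatrix} \cos(t) + i\sin(t) \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ \cos(t) - i\sin(t) \end{bmatrix} \right\}$$

of $\mathbb{C}^2$ linearly dependent?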


Exercise 160 

Let $F$ be a field and let $V$ be the subspace of $F[X]$ consisting of all those
polynomials of degree at most 4. Let $p_1(X), \ldots, p_5(X)$ be distinct polynomials
in $V$ satisfying the condition that $p_i(0) = 1$ for each $1 \le i \le 5$. Is the set
$\{p_1(X), \ldots, p_5(X)\}$ necessarily linearly dependent?


Exercise 161 
Consider the functions f : x +> 5* and g : x > 5%. Is {f,g} a linearly- 


dependent subset of RÈ? 


Exercise 162 


N 
a 


Find a, b € Q such that the subset a—b|,|b of Q? is linearly depen- 


dent. 
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Exercise 163 


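Let $F = \mathbb{Q}$. Is the subset $\left\{ \begin{bmatrix} 4 \\ 2 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 3 \\ 4 \end{bmatrix} \right\}$ of $F^3$ linearly independent?
What happens if $F = GF(5)$?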


Exercise 164 

Let $V$ be a vector space over a field $F$ and let $n > 1$ be an integer. Let $Y$ be the
set of all vectors $\begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} \in V^n$ satisfying the condition that the set $\{v_1, \ldots, v_n\}$ is
linearly dependent. Is $Y$ necessarily a subspace of $V^n$?


Exercise 165 
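Is the subset $\left\{ \begin{bmatrix} 1 + i \\ 3 + 8i \\ 5 + 7i \end{bmatrix}, \begin{bmatrix} 1 - i \\ 5 \\ 2 + i \end{bmatrix}, \begin{bmatrix} 1 + i \\ 3 + 2i \\ 4 - i \end{bmatrix} \right\}$ of $\mathbb{C}^3$ linearly independent
when we consider $\mathbb{C}^3$ as a vector space over $\mathbb{C}$? Is it linearly independent when
we consider $\mathbb{C}^3$ as a vector space over $\mathbb{R}$?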


Exercise 166 
For each nonnegative integer $n$, let $f_n \in \mathbb{R}^{\mathbb{R}}$ be the function defined by $f_n : x \mapsto
\sin^n(x)$. Is the subset $\{f_n \mid n \ge 0\}$ of $\mathbb{R}^{\mathbb{R}}$ linearly independent?


Exercise 167
Let $V = C(-1, 1)$, which is a vector space over $\mathbb{R}$. Let $f, g \in V$ be the functions
defined by $f : x \mapsto x^2$ and $g : x \mapsto |x|x$. Is $\{f, g\}$ linearly independent?


Exercise 168
Let $V$ be a vector space over $GF(5)$ and let $v_1, v_2, v_3 \in V$. Is the subset
$\{v_1 + v_2, v_1 - v_2 + v_3, 2v_2 + v_3, v_2 + v_3\}$ of $V$ linearly independent?


Exercise 169

Let $F$ be a field of characteristic different from 2 and let $V$ be a vector space
over $F$ containing a linearly-independent subset $\{v_1, v_2, v_3\}$. Show that the set
$\{v_1 + v_2, v_2 + v_3, v_1 + v_3\}$ is also linearly independent.


Exercise 170 


Is the subset of GF(3)* linearly independent? 


NNR rR 
areas 
aera 
=e 
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Exercise 171 


di1 

Let t < n be positive integers and, for all 1 <i < t, let v; = : be a vector in 
ain 

IR" chosen so that 2|a;;| > a= |a;;| for all 1 < j <n. Show that {v,,..., v} 


is linearly independent. 


Exercise 172 
If $\{v_1, v_2, v_3, v_4\}$ is a linearly-independent subset of a vector space $V$ over the
field $\mathbb{Q}$, is the set

$$\{3v_1 + 2v_2 + v_3 + v_4,\ 2v_1 + 5v_2,\ 3v_3 + 2v_4,\ 3v_1 + 4v_2 + 2v_3 + 3v_4\}$$

linearly independent as well?


Exercise 173 

Let A be a subset of R having at least three elements and let fi, f2, f3 € R4 
be the functions defined by f; : x > x'~!2*—!. Is the set {f,, fo, f3} linearly 
independent? 


Exercise 174 

Let F = GF(5) and let V = F”, which is a vector space over F. Let f : x > x? 
and g : x +> x° be elements of V. Find an element h of V such that { f, g, h} is 
linearly independent. 


Exercise 175 
Consider R as a vector space over Q. Is the subset {(a — 2)~! | a € Q} of this 
space linearly independent? 


Exercise 176 

In the vector space V = R® over R, consider the functions fiixp In((x? + 
Patt), forxt In(/x2+D, and fz: x + In(x+ +7). Is the subset 
{ fi, fo, fa} of V linearly independent? 


Exercise 177 


Show that the subset 2/,} 1],] 0 of GF(p)? is linearly independent 


© 
= 


if and only if p 43. 


Exercise 178 

Let $F$ be a subfield of a field $K$ and let $n$ be a positive integer. Show that a
nonempty linearly-independent subset $D$ of $F^n$ remains linearly independent
when considered as a subset of $K^n$.


Exercise 179
Let $F = GF(5)$ and let $V = F^F$. For $4 \le k \le 7$, let $f_k \in V$ be defined by $f_k :
a \mapsto a^k$. Is the subset $\{f_k \mid 4 \le k \le 7\}$ of $V$ linearly independent?


Exercise 180 

Let $V$ be a vector space over $\mathbb{R}$. For vectors $v \ne w$ in $V$, let $K(v, w)$ be the
set of all vectors in $V$ of the form $(1 - a)v + aw$, where $0 \le a \le 1$. Given
vectors $v, w, y \in V$ satisfying the condition that the set $\{w - v, y - v\}$ is linearly
independent (and so, in particular, its elements are distinct), show that the set

$$K\left(v, \tfrac{1}{2}(w + y)\right) \cap K\left(w, \tfrac{1}{2}(v + y)\right) \cap K\left(y, \tfrac{1}{2}(v + w)\right)$$

is nonempty, and determine how many elements it can have.


Exercise 181 

Let $V$ be a vector space finitely generated over a field $F$ and let $B = \{v_1, \ldots, v_n\}$
be a basis for $V$. Let $y \in V \setminus B$. Show that the set $\{v_1, \ldots, v_n, y\}$ has a unique
minimal linearly-dependent subset.


Exercise 182 
Find all of the minimal linearly-dependent subsets of the subset 


to) Ls} Lo) Col ED 


Exercise 183 

Let $V$ be a vector space over a field $F$ and let $D$ and $D'$ be distinct finite minimal
linearly-dependent subsets of $V$ which are not disjoint. If $v \in D \cap D'$, show that
$(D \cup D') \setminus \{v\}$ is linearly dependent.


Exercise 184 

Let F be a field of characteristic other than 2. Let V be the subspace of F[X] 
consisting of all polynomials of degree at most 3. Is {X + 2, X? + 1, X? + X?, 
X? — X*)} a basis of V? 


Exercise 185 
Let $\{v_1, \ldots, v_n\}$ be a basis for a vector space $V$ over a field $F$. Is the set
$\{v_1 + v_2, v_2 + v_3, \ldots, v_{n-1} + v_n, v_n + v_1\}$ necessarily also a basis for $V$ over $F$?


Exercise 186
Is $\{1 + 2\sqrt{5}, -3 + \sqrt{5}\}$ a basis for $\mathbb{Q}(\sqrt{5})$ as a vector space over $\mathbb{Q}$?


5 Linear Independence and Dimension 


Exercise 187 
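For which values of $a \in \mathbb{R}$ is the set

$$\left\{ \begin{bmatrix} a & 2a \\ 2 & 3a \end{bmatrix}, \begin{bmatrix} 1 & 2 \\ 2a & 3 \end{bmatrix}, \begin{bmatrix} 1 & 2a \\ a + 1 & a + 2 \end{bmatrix}, \begin{bmatrix} 1 & a + 1 \\ 2 & 2a + 1 \end{bmatrix} \right\}$$

a basis for $M_{2 \times 2}(\mathbb{R})$ as a vector space over $\mathbb{R}$?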


Exercise 188 
Let F be an algebraically-closed field and let (K, e) be an associative F-algebra 
having a basis {v1, v2} as a vector space over F. Show that v2 = v2 Or v5 = Ox. 


Exercise 189 

Let $V$ be a vector space over a field $F$. A nonempty subset $U$ of $V$ is nearly
linearly independent if and only if $U$ is linearly dependent but $U \setminus \{u\}$ is linearly
independent for every $u \in U$. Find an example of a set of three vectors
in R? which is nearly linearly independent. Does there exist a nearly linearly 
independent subset of R? having four elements? 


Exercise 190 
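Find a basis for the subspace $W$ of $\mathbb{R}^4$ generated by

$$\left\{ \begin{bmatrix} 4 \\ 2 \\ 6 \\ -2 \end{bmatrix}, \begin{bmatrix} 1 \\ -1 \\ 3 \\ -1 \end{bmatrix}, \begin{bmatrix} 1 \\ 2 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 5 \\ -3 \\ 1 \end{bmatrix} \right\}.$$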


Exercise 191 
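For each real number $a$, let $f_a \in \mathbb{R}^{\mathbb{R}}$ be defined by

$$f_a : r \mapsto \begin{cases} 1 & \text{if } r = a, \\ 0 & \text{otherwise.} \end{cases}$$

Is $\{f_a \mid a \in \mathbb{R}\}$ a basis for $\mathbb{R}^{\mathbb{R}}$ over $\mathbb{R}$?


Exercise 192

Let $A$ be a nonempty finite set and let $V$ be the collection of all subsets of $A$,
which is a vector space over $GF(2)$. For each $a \in A$, let $v_a = \{a\}$. Is $\{v_a \mid a \in A\}$
a basis for $V$?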


Exercise 193 
Show that $\left\{ \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix}, \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \right\}$ is a basis for the vector
space $M_{2 \times 2}(\mathbb{C})$ over $\mathbb{C}$. (The last three of these matrices are known as the Pauli
matrices and play a very important part in the formulation of quantum physics.)


Exercise 194
Let F be a field and let a, b, c € F. Determine whether 


is a basis for F°. 


Exercise 195 
Let $V$ be a vector space finitely generated over a field $F$ having a basis
$\{v_1, \ldots, v_n\}$. Is $\{v_1, \sum_{i=1}^{2} v_i, \ldots, \sum_{i=1}^{n} v_i\}$ necessarily a basis for $V$?


Exercise 196 

Let F = GF(p) for some prime integer p, let n be a positive integer, and let V be 
a vector space of dimension n over F. In how many ways can we choose a basis 
for V? 


Exercise 197 
Let $V$ be a three-dimensional vector space over a field $F$, with basis $\{v_1, v_2, v_3\}$.
Is $\{v_1 + v_2, v_2 + v_3, v_1 - v_3\}$ a basis for $V$?


Exercise 198

Let $V$ be a vector space of finite dimension $n$ over $\mathbb{C}$ having basis $\{v_1, \ldots, v_n\}$.
Show that $\{v_1, \ldots, v_n, iv_1, \ldots, iv_n\}$ is a basis for $V$, considered as a vector space
over $\mathbb{R}$.


Exercise 199
Let $V$ be a vector space of finite dimension $n > 0$ over $\mathbb{R}$ and, for each positive
integer $i$, let $U_i$ be a proper subspace of $V$. Show that $V \ne \bigcup_{i=1}^{\infty} U_i$.


Exercise 200

Let $V$ be a vector space over a field $F$ which is not finite dimensional, and
let $W$ be a proper subspace of $V$. Show that there exists an infinite collection
$\{Y_1, Y_2, \ldots\}$ of subspaces of $V$ satisfying $\bigcap_{i=1}^{\infty} Y_i \subseteq W$ but $\bigcap_{i=1}^{n} Y_i \not\subseteq W$ for all
$n \ge 1$.


Exercise 201 

Let $V$ be the subspace of $\mathbb{R}[X]$ consisting of all polynomials of degree at most 5,
and let $A = \{X^5 + X^4, X^5 - 7X^3, X^5 - 1, X^5 + 3X\}$. Show that this subset of $V$
is linearly independent and extend it to a basis of $V$.


Exercise 202 

Let $V$ be a vector space of finite dimension $n$ over a field $F$, and let $W$ be a
subspace of $V$ of dimension $n - 1$. If $U$ is a subspace of $V$ not contained in $W$,
show that $\dim(W \cap U) = \dim(U) - 1$.


Exercise 203

Let $a, b, c, d$ be rational numbers such that $\{a + c\sqrt{3}, b + d\sqrt{3}\}$ is a basis for
$\mathbb{Q}(\sqrt{3})$ as a vector space over $\mathbb{Q}$. Is $\{c + a\sqrt{3}, d + b\sqrt{3}\}$ a basis for $\mathbb{Q}(\sqrt{3})$ as a
vector space over $\mathbb{Q}$? Is $\{a + c\sqrt{5}, b + d\sqrt{5}\}$ a basis for $\mathbb{Q}(\sqrt{5})$ as a vector space
over $\mathbb{Q}$?


Exercise 204 
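Find a real number $a$ such that

$$\dim\left( \mathbb{R}\left\{ \begin{bmatrix} -9 \\ a \\ -1 \\ -5 \\ -14 \end{bmatrix}, \begin{bmatrix} 2 \\ -5 \\ 3 \\ 0 \\ 2 \end{bmatrix}, \begin{bmatrix} 1 \\ 4 \\ -1 \\ 1 \\ 2 \end{bmatrix}, \begin{bmatrix} 3 \\ -1 \\ 2 \\ 1 \\ 4 \end{bmatrix}, \begin{bmatrix} -1 \\ 9 \\ -4 \\ 1 \\ 0 \end{bmatrix} \right\} \right) = 2.$$

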
Exercise 205 
2 1 —1 
1 2 1 4 : . : 
Let W =R 3 elol C R*. Determine the dimension of W and 
1 1 0 
find a basis for it. 
Exercise 206 
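Consider the vectors

$$v_1 = \begin{bmatrix} 0 \\ 1 \\ 0 \\ 1 \\ 0 \end{bmatrix}, \quad v_2 = \begin{bmatrix} 7 \\ 4 \\ 1 \\ 8 \\ 3 \end{bmatrix}, \quad v_3 = \begin{bmatrix} 0 \\ 3 \\ 0 \\ 4 \\ 0 \end{bmatrix}, \quad v_4 = \begin{bmatrix} 1 \\ 9 \\ 5 \\ 7 \\ 1 \end{bmatrix}, \quad \text{and} \quad v_5 = \begin{bmatrix} 0 \\ 1 \\ 0 \\ 5 \\ 0 \end{bmatrix}$$

in the vector space $\mathbb{Q}^5$. Do there exist rational numbers $a_{ij}$, for $1 \le i, j \le 5$, such
that the subset $\left\{ \sum_{j=1}^{5} a_{ij} v_j \mid 1 \le i \le 5 \right\}$ of $\mathbb{Q}^5$ is linearly independent?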


Exercise 207 

Let F be a subfield of a field K satisfying the condition that K is finitely gener- 
ated as a vector space over F. For each c € K, show that there exists a nonzero 
polynomial p(X) € F[X] satisfying p(c) = 0. 


Exercise 208 
1 —1 —5 
_ 2 1 4 _ —1 6 
Let W = R HE 1 cC R* and let V = R ol 3 . Com- 
0 1 1 0 


pute dim(W + V) and dim(W N V). 
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Exercise 209 

Let F be a subfield of a field K satisfying the condition that the dimension of K 
as a vector space over F is finite and equal to r. Let V be a vector space of finite 
dimension n > 0 over K. Find the dimension of V as a vector space over F. 


Exercise 210 

Let $V$ be a vector space over a field $F$ having infinite dimension over $F$. Show
that there exists an infinite sequence $W_1, W_2, \ldots$ of proper subspaces of $V$,
satisfying $\bigcup_{i=1}^{\infty} W_i = V$.


Exercise 211
Let $F = GF(p)$, where $p$ is a prime integer, and let $V$ be a vector space over $F$
having finite dimension $n$. How many subspaces of dimension 1 does $V$ have?


Exercise 212 

Let $W$ be the subset of $\mathbb{R}^{\mathbb{R}}$ consisting of all functions of the form $x \mapsto
a \cdot \cos(x - b)$, for real numbers $a$ and $b$. Show that $W$ is a subspace of $\mathbb{R}^{\mathbb{R}}$ and
find its dimension.


Exercise 213 


4 6 1 4 1 

_ 3 2 1 4 = 2 0 4 

Let W =R atelololy C R| and let Y = R 01°13 CR’. 
1 2 2 —2 2 


Find dim(W + Y) and dim(W N Y). 


Exercise 214 
Let $V$ be a vector space of finite dimension $n$ over a field $F$ and let $W$ and $Y$ be
distinct subspaces of $V$, each of dimension $n - 1$. What is $\dim(W \cap Y)$?


Exercise 215 
Let V be a finite-dimensional vector space over a field F and let B be a basis of 


V such that {| | wE B} is a basis for V?. What is the dimension of V? 


Exercise 216 
Let F be a field and let V be the subspace of F[X] consisting of all polynomials 
of degree at most 4. Find a complement for V in F[X]. 


Exercise 217 
Let F be a field and let V be the subspace of F[X] consisting of all polynomials 
of the form (X? + X + 1)p(X) for some p(X) € F[X]. Find a complement for 
V in F[X].


Exercise 218

Let $B$ be a nonempty proper subset of a set $A$. Let $F$ be a field and let $V = F^A$.
Let $W$ be the subspace of $V$ consisting of all those functions $f \in V$ satisfying
$f(b) = 0$ for all $b \in B$. Find a complement of $W$ in $V$.


Exercise 219 
Let F be a field of characteristic other than 2, let V be a vector space over F, and 


v v 

let U = v’ v,v' eV} cV? IsY= v v € V ù acomplement 
v+ v 

of U in V? 

Exercise 220 


Let $F$ be a field and let $p(X) \in F[X]$ have positive degree $k$. Let $W$ be the
subspace of $F[X]$ composed of all polynomials of the form $p(X)g(X)$ for some
$g(X) \in F[X]$. Show that $W$ has a complement in $F[X]$ of dimension $k$.


Exercise 221

Let $V$ be a vector space over a field $F$ which is not finite dimensional, and let
$V \supset W_1 \supset W_2 \supset \cdots$ be a chain of subspaces of $V$, each properly contained in
the one before it. Is the subspace $\bigcap_{i=1}^{\infty} W_i$ of $V$ necessarily finite-dimensional?


Exercise 222 

Let V be a vector space finitely generated over a field F. Let W and Y be 
subspaces of V and assume that there is a function f € FV satisfying the 
condition that f(w) < f(y) for all Oy 4 w € W and Oy Æ y € Y. Show that 
dim(W) + dim(Y) < dim(V). 


Exercise 223 

Let $(K, \bullet)$ be a division algebra of dimension 2 over $\mathbb{R}$ containing an element
$v_1$ which satisfies the condition that $v_1 \bullet v = v = v \bullet v_1$ for all $v \in K$. Show that
$(K, +, \bullet)$ is a field.


Exercise 224

For each $a \in \mathbb{R}$, the set $\mathbb{Q}[a] = \{p(a) \mid p(X) \in \mathbb{Q}[X]\}$ is a subspace of $\mathbb{R}$,
considered as a vector space over $\mathbb{Q}$. Find all pairs $(a, b)$ of real numbers $a \ne b$
satisfying the condition that the set $\{\mathbb{Q}[a], \mathbb{Q}[b]\}$ is independent over $\mathbb{Q}$.


Exercise 225 

Let $V$ be a vector space over a field $F$. Find a necessary and sufficient condition
for there to exist subspaces $W$ and $W'$ of $V$ such that $\{\{0_V\}, W, W'\}$ is
independent.


Exercise 226
Let $(K, \bullet)$ be a unital $\mathbb{R}$-algebra (not necessarily associative) with multiplicative
identity $e$, and let $\{v_i \mid i \in \Omega\}$ be a basis for $K$ over $\mathbb{R}$ containing $e$ (which is
equal to $v_t$ for some $t \in \Omega$). If $v = \sum_{i \in \Omega} c_i v_i \in K$, set $\bar{v} = c_t v_t - \sum_{j \ne t} c_j v_j$.
If $v \in K$, is it true that $v \bullet \bar{v} = \bar{v} \bullet v$ and $\bar{\bar{v}} = v$? (Note that this construction
generalizes the notion of the conjugate of a complex number.)


Exercise 227 

For each nonnegative integer $n$, define the subsets $P_n$, $A_n$, and $F_n$ of $\mathbb{R}$ as
follows:

(1) $P_0 = \emptyset$, $A_0 = \{1\}$, and $F_0 = \mathbb{Q}$;

(2) If $n > 0$, then $P_n$ is the set of the first $n$ prime integers, $A_n$ consists of 1 and
the set of square roots of products of distinct elements of $P_n$, and $F_n = \mathbb{Q}A_n$.

Show that each $A_n$ is a linearly-independent subset of $\mathbb{R}$, considered as a vector
space over $\mathbb{Q}$, and that $F_n$ is a subfield of $\mathbb{R}$, having the property that every
element of $F_n$ the square of which belongs to $\mathbb{Q}$ must belong to $\mathbb{Q}a$, for some
$a \in A_n$.


Exercise 228 

Find all a € R (if any exist) satisfying the condition that the dimension of 
-1 2 1 

R 2a |, 2) 4) 1 is at most 2. 
—2 -1 0 


Exercise 229 

Give an example of a vector space $V$ finitely generated over a field $F$, together
with nonempty subsets $B_1$, $B_2$, and $B_3$ of $V$ satisfying the following conditions:

(1) Each $B_i$ is linearly independent;

(2) For each $1 \le i \ne j \le 3$ there exists a basis of $V$ containing $B_i \cup B_j$;

(3) There is no basis of $V$ containing $B_1 \cup B_2 \cup B_3$.


Exercise 230 

Let V be a vector space over R. A fuzzification of V is a function from V to 
the unit interval I of real numbers, satisfying the condition that w(av + bw) > 
min{u(v), u(w)} for all a,b € R and all v, w € V. A finite nonempty linearly- 
independent subset {v),..., Un} of V is -linearly independent if and only if it 
satisfies the additional condition that uO, ajv;) = min{ajv],..., An Up}. 

(1) Show that the function u : R? + I defined by 


1 ifa=b=0, 
u HG 1 ifa=0andb#0, 
i otherwise 


is a fuzzification on V. 


(2) Is the linearly-independent subset | H ; E 


| | of R also u-linearly inde- 


pendent? 
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Exercise 231 

Let $V$ be a vector space over a field $F$ and let $B$ be a fixed basis of $V$. We then
know that each element $v \in V$ can be written in a unique way as $v = \sum_{w \in B} c_w w$,
where the $c_w$ are scalars, only finitely-many of which are nonzero. Let $n(v)$ be
the number of nonzero scalars $c_w$ in this representation. (Note that $n(v) = 0$ if
and only if $v = 0_V$.) Define a relation $\preceq$ on $V$ by setting $v_1 \preceq v_2$ if and only if
$n(v_1) \le n(v_2)$. Is this a partial order relation on $V$?


Exercise 232 
Let V be a vector space over a field F and let D be a finite minimal linearly- 
dependent subset of V. Find dim(F D). 


Linear Transformations 


Let $V$ and $W$ be vector spaces over a field $F$. A function $\alpha : V \to W$ is a linear
transformation or homomorphism if and only if for all $v_1, v_2 \in V$ and $a \in F$ we
have $\alpha(v_1 + v_2) = \alpha(v_1) + \alpha(v_2)$ and $\alpha(av_1) = a\alpha(v_1)$. We note that, as a
consequence of the second condition, we have $\alpha(0_V) = \alpha(0 \cdot 0_V) = 0\alpha(0_V) = 0_W$. If
$(K, \bullet)$ and $(L, *)$ are $F$-algebras, then a linear transformation $\alpha : K \to L$ is a
homomorphism of $F$-algebras if, in addition, it satisfies
$\alpha(v_1 \bullet v_2) = \alpha(v_1) * \alpha(v_2)$ for all $v_1, v_2 \in K$. If both $K$ and $L$ are unital, then it is a
homomorphism of unital $F$-algebras if it also sends the identity element of $K$ for $\bullet$
to the identity element of $L$ for $*$.


With kind permission of the Archives of the Mathematisches Forschungsinstitut Oberwolfach.

Linear transformations between finite-dimensional vector spaces were
studied by Peano. Linear transformations between infinite-dimensional
spaces were first considered in the late nineteenth century by Italian
mathematician Salvatore Pincherle.


Example Let $V$ be a vector space over a field $F$. Every scalar $c \in F$ defines a linear
transformation $\sigma_c : V \to V$ given by $\sigma_c : v \mapsto cv$. In particular, $\sigma_1$ is the identity
function $v \mapsto v$ and $\sigma_0$ is the 0-function $v \mapsto 0_V$.


Example Let $F$ be a field and let $a_1, \ldots, a_6$ be scalars in $F$. The function
$\alpha : F^2 \to F^3$ defined by $\alpha : \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} \mapsto \begin{bmatrix} a_1 c_1 + a_2 c_2 \\ a_3 c_1 + a_4 c_2 \\ a_5 c_1 + a_6 c_2 \end{bmatrix}$ is a linear transformation.


J.S. Golan, The Linear Algebra a Beginning Graduate Student Ought to Know,
DOI 10.1007/978-94-007-2636-9_6, © Springer Science+Business Media B.V. 2012


The previous example can be generalized in an extremely significant manner. Let
$k$ and $n$ be positive integers and let $F$ be a field. Every matrix $A = [a_{ij}] \in M_{k \times n}(F)$
defines a linear transformation from $F^n$ to $F^k$ given by

$$\begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix} \mapsto \begin{bmatrix} a_{11}c_1 + \cdots + a_{1n}c_n \\ a_{21}c_1 + \cdots + a_{2n}c_n \\ \vdots \\ a_{k1}c_1 + \cdots + a_{kn}c_n \end{bmatrix}.$$

In what follows, we will show that every linear transformation from $F^n$ to $F^k$ can
be defined in this manner.
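
The two defining conditions of a linear transformation can be spot-checked for a map of this kind. The following is a minimal sketch, assuming $F = \mathbb{Q}$, $k = 3$, $n = 2$, and a sample matrix `A` chosen only for illustration. The function names `apply`, `add`, and `scale` are hypothetical. A finite check on sample vectors illustrates the conditions but does not, of course, prove them.

```python
from fractions import Fraction

def apply(A, c):
    """Image of the column vector c under the map defined by the matrix A."""
    return [sum(a * x for a, x in zip(row, c)) for row in A]

def add(u, v):
    """Componentwise sum of two vectors."""
    return [x + y for x, y in zip(u, v)]

def scale(a, v):
    """Scalar multiple a*v."""
    return [a * x for x in v]

A = [[1, 2], [3, 4], [5, 6]]      # assumed 3 x 2 matrix over Q
u, v = [1, -1], [2, 5]
a = Fraction(3, 7)

additive = apply(A, add(u, v)) == add(apply(A, u), apply(A, v))
homogeneous = apply(A, scale(a, u)) == scale(a, apply(A, u))
first_column = apply(A, [1, 0])   # image of the first standard basis vector
```

Note that the image of the $j$th standard basis vector is the $j$th column of the matrix, which is the observation behind the converse statement announced in the text.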


Example Let $F$ be a field of characteristic 0. Then there are linear transformations
$\alpha$ and $\beta$ from $F[X]$ to itself defined by

$$\alpha : \sum_{i=0}^{\infty} a_i X^i \mapsto \sum_{i=1}^{\infty} i a_i X^{i-1} \quad \text{and} \quad \beta : \sum_{i=0}^{\infty} a_i X^i \mapsto \sum_{i=0}^{\infty} (1 + i)^{-1} a_i X^{i+1}.$$

(By $1 + i$, we mean the sum of $1 + i$ copies of the identity element for multiplication
of $F$; since the characteristic of $F$ is 0, we know that this element is nonzero, and
so is a unit in $F$.)
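
These two maps are formal differentiation and formal integration of polynomials. The sketch below, which assumes $F = \mathbb{Q}$ and represents a polynomial by its list of coefficients $[a_0, a_1, \ldots]$, shows them acting on a sample polynomial. The representation, the sample polynomial `p`, and the use of the names `alpha` and `beta` for the Python functions are choices made for this illustration.

```python
from fractions import Fraction

def alpha(p):
    """Formal derivative: the coefficient a_i of X^i contributes i*a_i to X^(i-1)."""
    return [i * a for i, a in enumerate(p)][1:]

def beta(p):
    """Formal integral: the coefficient a_i of X^i contributes a_i/(1+i) to X^(i+1)."""
    return [Fraction(0)] + [Fraction(a) / (i + 1) for i, a in enumerate(p)]

p = [5, 0, 3, 4]                  # 5 + 3X^2 + 4X^3 (assumed sample polynomial)

derivative = alpha(p)             # 6X + 12X^2
integral = beta(p)                # 5X + X^3 + X^4
round_trip = alpha(beta(p))       # recovers p
lossy = beta(alpha(p))            # 3X^2 + 4X^3: the constant term is lost
```

For this sample, applying `beta` and then `alpha` returns the original polynomial, while the other order discards the constant term, so the two maps do not commute.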


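Example Let $V$ and $W$ be vector spaces over a field $F$ and let $k$ and $n$ be positive
integers. For all $1 \le i \le k$ and $1 \le j \le n$, let $\alpha_{ij} : V \to W$ be a linear transformation.
Then there is a linear transformation from $M_{k \times n}(V)$ to $M_{k \times n}(W)$ defined by

$$\begin{bmatrix} v_{11} & \cdots & v_{1n} \\ \vdots & & \vdots \\ v_{k1} & \cdots & v_{kn} \end{bmatrix} \mapsto \begin{bmatrix} \alpha_{11}(v_{11}) & \cdots & \alpha_{1n}(v_{1n}) \\ \vdots & & \vdots \\ \alpha_{k1}(v_{k1}) & \cdots & \alpha_{kn}(v_{kn}) \end{bmatrix}.$$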


Example Let $V$ be the subspace of $\mathbb{R}^{\mathbb{R}}$ consisting of all differentiable functions.
For each $f \in V$, we define a function $Df : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$, called the differential of $f$,
by setting $Df : (a, b) \mapsto f'(a)b$, where $f'$ is the derivative of $f$. Then the function
$D : V \to \mathbb{R}^{\mathbb{R} \times \mathbb{R}}$ given by $f \mapsto Df$ is a linear transformation. Such linear
transformations play an important part in differential geometry.


Example Sometimes linear transformations between $F$-algebras which are not 
homomorphisms of $F$-algebras play an important role. Let $(K, \bullet)$ be an associative 
algebra over a field $F$ and let $c \in F$. Then $K$ is a Baxter algebra over $F$ of weight 
$c$ if and only if there exists a linear transformation $\alpha : K \to K$ satisfying the 
condition that $\alpha(x) \bullet \alpha(y) = \alpha(\alpha(x) \bullet y) + \alpha(x \bullet \alpha(y)) + c\alpha(x \bullet y)$ for all $x, y \in K$. 
Thus, for example, if $K$ is the $\mathbb{R}$-algebra of all continuous functions from $\mathbb{R}$ to itself, 
the linear transformation $\alpha : K \to K$ given by $\alpha(f) : t \mapsto \int_0^t f(s)\,ds$ defines on $K$ 
the structure of a Baxter algebra of weight 0. If $F$ is any field and if $K = F^\infty$ with 
componentwise addition and multiplication, then the function $\alpha : K \to K$ given by 
$\alpha : [a_1, a_2, \ldots] \mapsto [a_1, a_1 + a_2, a_1 + a_2 + a_3, \ldots]$ defines on $K$ the structure of a 
Baxter algebra of weight $-1$. (In the first coordinate, the condition applied to $x = [a_1, \ldots]$ 
and $y = [b_1, \ldots]$ reads $a_1b_1 = a_1b_1 + a_1b_1 + c\,a_1b_1$, which forces $c = -1$.) 
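
The Baxter condition for the partial-sum map can be checked numerically on truncated sequences, since the first $N$ entries of every term in the condition depend only on the first $N$ entries of $x$ and $y$. The sketch below assumes $F = \mathbb{Q}$; the helper names and the sample sequences are hypothetical. For the partial sums exactly as written above (each sum including its last term), the check finds that the condition holds with $c = -1$ and fails with $c = 1$. 

```python
from fractions import Fraction
from itertools import accumulate

def partial_sums(x):
    """[a1, a2, a3, ...] -> [a1, a1+a2, a1+a2+a3, ...] on a finite truncation."""
    return list(accumulate(x))

def mul(x, y):
    """Componentwise product of two truncated sequences."""
    return [a * b for a, b in zip(x, y)]

def baxter_defect(x, y, c):
    """Left side minus right side of the weight-c Baxter condition
    P(x)P(y) = P(P(x)y) + P(xP(y)) + c P(xy), with P the partial-sum map."""
    P = partial_sums
    lhs = mul(P(x), P(y))
    rhs = [u + v + c * w
           for u, v, w in zip(P(mul(P(x), y)), P(mul(x, P(y))), P(mul(x, y)))]
    return [l - r for l, r in zip(lhs, rhs)]

x = [Fraction(1), Fraction(2), Fraction(-3), Fraction(1, 2)]
y = [Fraction(4), Fraction(-1), Fraction(5), Fraction(2, 3)]

defect_minus_one = baxter_defect(x, y, -1)  # all zeros
defect_plus_one = baxter_defect(x, y, 1)    # not all zeros
```

A zero defect on sample data is only evidence, not a proof; the proof is the short computation comparing $\sum_{i, j \le n} x_i y_j$ with the sums over $i \le j \le n$ and $j \le i \le n$, which count the diagonal terms twice. 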
Example Linear transformations are considered nice from an algebraic point of 
view, but may be less so from an analytic point of view. Let $B$ be a Hamel basis 
of $\mathbb{R}$ over $\mathbb{Q}$. Then for each real number $r$ there exists a unique finite subset 
$\{u_1(r), \ldots, u_{n(r)}(r)\}$ of $B$ and scalars $a_1(r), \ldots, a_{n(r)}(r)$ in $\mathbb{Q}$ satisfying 
$r = \sum_{j=1}^{n(r)} a_j(r) u_j(r)$. The function from $\mathbb{R}$ to $\mathbb{R}$ defined by $r \mapsto \sum_{j=1}^{n(r)} a_j(r)$ is a linear 
transformation of vector spaces over $\mathbb{Q}$, but is not continuous at any $r \in \mathbb{R}$. 


Let $V$ and $W$ be vector spaces over a field $F$. To any function $f : V \to W$ we 
can associate the subset 
\[ \mathrm{gr}(f) = \left\{ \begin{bmatrix} v \\ f(v) \end{bmatrix} \;\middle|\; v \in V \right\} \]
of $V \times W$, called the graph of $f$. We can use the notion of graph to characterize 
linear transformations in terms of subspaces. 


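Proposition 6.1 Let $V$ and $W$ be vector spaces over a field $F$ and let 
$\alpha : V \to W$ be a function. Then $\alpha$ is a linear transformation if and only if 
$\mathrm{gr}(\alpha)$ is a subspace of $V \times W$. 


Proof Assume that $\alpha$ is a linear transformation. If $v, v' \in V$ and $c \in F$ then 

\[ \begin{bmatrix} v \\ \alpha(v) \end{bmatrix} + \begin{bmatrix} v' \\ \alpha(v') \end{bmatrix} = \begin{bmatrix} v + v' \\ \alpha(v) + \alpha(v') \end{bmatrix} = \begin{bmatrix} v + v' \\ \alpha(v + v') \end{bmatrix} \in \mathrm{gr}(\alpha) \]

and 

\[ c \begin{bmatrix} v \\ \alpha(v) \end{bmatrix} = \begin{bmatrix} cv \\ c\alpha(v) \end{bmatrix} = \begin{bmatrix} cv \\ \alpha(cv) \end{bmatrix} \in \mathrm{gr}(\alpha), \]

showing that $\mathrm{gr}(\alpha)$ is closed under taking sums and scalar multiples, and so is a 
subspace of $V \times W$. 
Conversely, if it is such a subspace then for $v, v' \in V$ and $c \in F$ we note that 

\[ \begin{bmatrix} v \\ \alpha(v) \end{bmatrix} + \begin{bmatrix} v' \\ \alpha(v') \end{bmatrix} = \begin{bmatrix} v + v' \\ \alpha(v) + \alpha(v') \end{bmatrix} \in \mathrm{gr}(\alpha), \]

and so we must have $\alpha(v) + \alpha(v') = \alpha(v + v')$. Similarly, 

\[ c \begin{bmatrix} v \\ \alpha(v) \end{bmatrix} = \begin{bmatrix} cv \\ c\alpha(v) \end{bmatrix} \in \mathrm{gr}(\alpha), \]

and so we must have $c\alpha(v) = \alpha(cv)$. Thus $\alpha$ is a linear transformation. 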


Let $V$ and $W$ be vector spaces over a field $F$. If $\alpha$ and $\beta$ are linear transformations 
from $V$ to $W$, they are, in particular, functions in $W^V$, and so the function 
$\alpha + \beta : V \to W$ is defined by $\alpha + \beta : v \mapsto \alpha(v) + \beta(v)$ for all $v \in V$. For all $v, v' \in V$ and 
all $c \in F$, we have 

\begin{align*}
(\alpha + \beta)(v + v') &= \alpha(v + v') + \beta(v + v') \\
&= \alpha(v) + \alpha(v') + \beta(v) + \beta(v') \\
&= (\alpha + \beta)(v) + (\alpha + \beta)(v')
\end{align*}
and $(\alpha + \beta)(cv) = \alpha(cv) + \beta(cv) = c\alpha(v) + c\beta(v) = c[\alpha(v) + \beta(v)] = c(\alpha + \beta)(v)$. 


Thus we see that $\alpha + \beta$ is a linear transformation from $V$ to $W$. If $c \in F$ is a 
scalar then the function $c\alpha$ from $V$ to $W$ is defined by $c\alpha : v \mapsto c\alpha(v)$ and this, 
again, is a linear transformation from $V$ to $W$. It is easy to check that the set of all 
linear transformations from $V$ to $W$ is a subspace of $W^V$, which we will denote by 
$\mathrm{Hom}(V, W)$, or $\mathrm{Hom}_F(V, W)$ in case the field needs to be emphasized. 


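Example Let $V$ and $W$ be vector spaces over $\mathbb{R}$ having complexifications 
$U$ and $Y$, respectively. If $\alpha \in \mathrm{Hom}_{\mathbb{R}}(V, W)$ then the function 
\[ \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} \mapsto \begin{bmatrix} \alpha(v_1) \\ \alpha(v_2) \end{bmatrix} \]
belongs to $\mathrm{Hom}_{\mathbb{C}}(U, Y)$. 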


Since Hom(V, W) is a vector space, we can apply concepts we have already 
considered for vectors to linear transformations. For example, we can talk about 
a linearly-dependent or linearly-independent set of linear transformations from a 
vector space V over a field F to a vector space W over F. However, we must be 
very careful to remember that when we are doing so, we are working in the space 
Hom(V, W), and not in either V or W. The following example illustrates the pitfalls 
one can encounter. 


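Example Let $V$ and $W$ be vector spaces over the same field $F$. A nonempty subset 
$D = \{\alpha_1, \ldots, \alpha_n\}$ of $\mathrm{Hom}(V, W)$ is locally linearly dependent if and only if 
the subset $\{\alpha_1(v), \ldots, \alpha_n(v)\}$ of $W$ is linearly dependent for every $v \in V$. If $D$ is 
a linearly-dependent subset of $\mathrm{Hom}(V, W)$, then there exist scalars $c_1, \ldots, c_n$, not 
all of which are equal to 0, such that $\sum_{i=1}^{n} c_i\alpha_i$ is the 0-function. In particular, 
for each $v \in V$ we see that $\sum_{i=1}^{n} c_i\alpha_i(v) = 0_W$ and so $D$ is locally linearly 
dependent. The converse, however, is false. It may be possible for $D$ to be linearly 
independent and still locally linearly dependent. To see this, take $V = W = F^2$ 
and let $D = \{\alpha_1, \alpha_2\} \subseteq \mathrm{Hom}(F^2, F^2)$, where we define 

\[ \alpha_1 : \begin{bmatrix} a \\ b \end{bmatrix} \mapsto \begin{bmatrix} a \\ 0 \end{bmatrix} \quad \text{and} \quad \alpha_2 : \begin{bmatrix} a \\ b \end{bmatrix} \mapsto \begin{bmatrix} b \\ 0 \end{bmatrix}. \]

If $v \in F^2$, then $\{\alpha_1(v), \alpha_2(v)\}$ is a subset of the one-dimensional subspace 
$F \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ of $F^2$ and so cannot be linearly independent. On the other hand, $D$ 
is linearly independent since if there exist scalars $c$ and $d$ satisfying the condition 
that $c\alpha_1 + d\alpha_2$ is the 0-function, then 

\[ \begin{bmatrix} 0 \\ 0 \end{bmatrix} = (c\alpha_1 + d\alpha_2)\left( \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right) = c\alpha_1\left( \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right) + d\alpha_2\left( \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right) = \begin{bmatrix} c \\ 0 \end{bmatrix}, \]

which implies that $c = 0$. Similarly, 

\[ \begin{bmatrix} 0 \\ 0 \end{bmatrix} = (c\alpha_1 + d\alpha_2)\left( \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right) = \begin{bmatrix} d \\ 0 \end{bmatrix}, \]

and so $d = 0$. 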
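
For a finite field, both claims of this example can be confirmed by exhaustive search. The sketch below assumes $F = GF(3)$ and takes the two maps to be $[a, b] \mapsto [a, 0]$ and $[a, b] \mapsto [b, 0]$, which is our reading of the example; the names `a1`, `a2`, and `dependent` are hypothetical. The two values $\alpha_1(v)$ and $\alpha_2(v)$ are treated as an ordered pair, so that a repeated vector counts as a dependency. 

```python
from itertools import product

p = 3  # GF(3) is an arbitrary choice; any prime would do

def a1(v):
    return (v[0], 0)

def a2(v):
    return (v[1], 0)

def dependent(x, y):
    """True if some (c, d) != (0, 0) in GF(p)^2 satisfies c*x + d*y = 0."""
    return any(
        all((c * xi + d * yi) % p == 0 for xi, yi in zip(x, y))
        for c, d in product(range(p), repeat=2)
        if (c, d) != (0, 0)
    )

V = list(product(range(p), repeat=2))

# Local linear dependence: the pair (a1(v), a2(v)) is dependent for every v.
locally_dependent = all(dependent(a1(v), a2(v)) for v in V)

# Linear independence in Hom(V, V): list every (c, d) with c*a1 + d*a2 = 0-function.
relations = [
    (c, d)
    for c, d in product(range(p), repeat=2)
    if all((c * a1(v)[i] + d * a2(v)[i]) % p == 0 for v in V for i in range(2))
]
```

Only the trivial pair `(0, 0)` survives in `relations`, so the two maps are linearly independent even though their values at each single vector are dependent. 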


The following proposition shows that the operation of a linear transformation is 
entirely determined by its action on elements of a basis. This result is extremely 
important, especially if the vector spaces involved are finitely generated. 
Proposition 6.2 Let $V$ and $W$ be vector spaces over a field $F$, and let $B$ 
be a basis of $V$. If $f \in W^B$ then there is a unique linear transformation 
$\alpha \in \mathrm{Hom}(V, W)$ satisfying the condition that $\alpha(u) = f(u)$ for all $u \in B$. 


Proof Since $B$ is a basis of $V$, we know that each vector $v \in V$ can be written as a 
linear combination $v = \sum_{i=1}^{n} a_iu_i$ of elements of $B$ in a unique way. We now define 
the function $\alpha : V \to W$ by $\alpha : v \mapsto \sum_{i=1}^{n} a_i f(u_i)$. This function is well defined as 
a result of the uniqueness of representation of $v$, as was shown in Proposition 5.4. 
Moreover, it is clear that $\alpha$ is a linear transformation. If $\beta : V \to W$ is a linear 
transformation satisfying the condition that $\beta(u) = f(u)$ for all $u \in B$ then 
$\beta(v) = \beta\left(\sum_{i=1}^{n} a_iu_i\right) = \sum_{i=1}^{n} a_i\beta(u_i) = \sum_{i=1}^{n} a_i f(u_i) = \alpha(v)$, and so $\beta = \alpha$. Thus $\alpha$ is 
unique. 
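
The construction in this proof is easy to carry out when $V = \mathbb{Q}^n$ with its standard basis, since the coordinates of a vector are then just its entries. The following sketch makes that assumption; the function name `extend_linearly` and the chosen images are hypothetical. 

```python
from fractions import Fraction

def extend_linearly(images):
    """Given images f(e_1), ..., f(e_n) of the standard basis of Q^n (each a
    vector in Q^m), return the linear map alpha with alpha(e_i) = f(e_i)."""
    m = len(images[0])

    def alpha(v):
        # alpha(v) = sum_i a_i f(e_i), where v = sum_i a_i e_i
        return [sum(a * w[j] for a, w in zip(v, images)) for j in range(m)]

    return alpha

# Hypothetical choice of f on the standard basis of Q^3, with values in Q^2.
images = [[Fraction(1), Fraction(0)],
          [Fraction(2), Fraction(1)],
          [Fraction(0), Fraction(-1)]]
alpha = extend_linearly(images)

value = alpha([Fraction(3), Fraction(-1), Fraction(2)])  # 3 f(e1) - f(e2) + 2 f(e3)
```

No choice is involved beyond the values on the basis, which is the uniqueness statement of the proposition in computational form. 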


Example Let $F$ be a field and let $c_0, c_1, \ldots$ be a sequence of elements of $F$. Then we 
have a linear transformation $\alpha : F[X] \to F$ defined by $\alpha : \sum_{i=0}^{\infty} a_iX^i \mapsto \sum_{i=0}^{\infty} a_ic_i$. 


Example We can use Proposition 6.2 to show how uncommon linear transformations 
really are. Let $F = GF(3)$ and let $V = F^4$. Then $V$ has $3^4 = 81$ elements and 
so the number of functions from $V$ to itself is $81^{81}$. On the other hand, a basis $B$ 
for $V$ over $F$ has 4 elements and so, since every linear transformation from $V$ to 
itself is totally determined by its action on $B$ and since any function from $B$ to $V$ 
defines such a linear transformation, we see that the number of linear transformations 
from $V$ to itself is $81^4$. Therefore, the probability that a randomly-selected function 
from $V$ to itself is a linear transformation is $81^4/81^{81} = 81^{-77}$, which is roughly 
$0.11134 \times 10^{-146}$. 
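
The count in this example is far too large to enumerate, but the same argument can be confirmed by brute force on a smaller space. The sketch below uses $GF(2)^2$ as a stand-in (our choice, not the example's), where there are $4^4 = 256$ functions and the argument predicts $4^2 = 16$ linear ones; it also recomputes the size of $81^{-77}$ using logarithms. The helper name `is_linear` is hypothetical. 

```python
from itertools import product
from math import log10

def is_linear(f, vectors, p):
    """Check additivity and homogeneity of f (a dict: vector -> vector) over GF(p)."""
    add = lambda u, v: tuple((a + b) % p for a, b in zip(u, v))
    scale = lambda c, v: tuple((c * a) % p for a in v)
    return (all(f[add(u, v)] == add(f[u], f[v]) for u in vectors for v in vectors)
            and all(f[scale(c, v)] == scale(c, f[v]) for c in range(p) for v in vectors))

# Stand-in space: V = GF(2)^2, with 4 elements and 4^4 = 256 functions V -> V.
p, n = 2, 2
vectors = list(product(range(p), repeat=n))
linear_count = sum(
    is_linear(dict(zip(vectors, images)), vectors, p)
    for images in product(vectors, repeat=len(vectors))
)

# The example itself: log10 of 81^4 / 81^81 = 81^(-77).
exponent10 = -77 * log10(81)
mantissa = 10 ** (exponent10 + 146)  # so that 81^(-77) = mantissa * 10^(-146)
```

The brute-force count agrees with the prediction from Proposition 6.2, and `mantissa` reproduces the leading digits of the probability quoted in the example. 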


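Proposition 6.3 Let $V$, $W$, and $Y$ be vector spaces over a field $F$ and let 
$\alpha : V \to W$ and $\beta : W \to Y$ be linear transformations. Then $\beta\alpha : V \to Y$ is 
a linear transformation. 


Proof If $v_1, v_2 \in V$ and if $c \in F$ then 
\begin{align*}
(\beta\alpha)(v_1 + v_2) &= \beta(\alpha(v_1 + v_2)) = \beta(\alpha(v_1) + \alpha(v_2)) \\
&= \beta(\alpha(v_1)) + \beta(\alpha(v_2)) = (\beta\alpha)(v_1) + (\beta\alpha)(v_2)
\end{align*}
and $(\beta\alpha)(cv_1) = \beta(\alpha(cv_1)) = \beta(c\alpha(v_1)) = c\beta(\alpha(v_1)) = c(\beta\alpha)(v_1)$, which proves 
the proposition. 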


Example It is often important and insightful to write a linear transformation as a 
composite of linear transformations of predetermined types. Consider the following 
situation: Let $a < b$ be real numbers and let $V$ be the vector space over $\mathbb{R}$ consisting 
of all functions from the closed interval $[a, b]$ to $\mathbb{R}$. Let $W$ be the subspace of $V$ 
consisting of all differentiable functions, and let $\delta : W \to V$ be the function which 
assigns to each function $f \in W$ its derivative. For each real number $a < c < b$, let 
$\varepsilon_c : V \to \mathbb{R}$ be the linear transformation defined by $\varepsilon_c : g \mapsto g(c)$. Then the Mean 
Value Theorem from calculus says that the linear transformation $\beta : W \to \mathbb{R}$ 
defined by $\beta : f \mapsto [f(b) - f(a)](b - a)^{-1}$ agrees at each $f \in W$ with $\varepsilon_c\delta$ for some $c$, 
which may depend on $f$. 


Let $V$ and $W$ be vector spaces over a field $F$ and let $\alpha : V \to W$ be a linear 
transformation. For $w \in W$, we denote $\{v \in V \mid \alpha(v) = w\}$ by $\alpha^{-1}(w)$. Note that 
this set may be empty. In particular, we will be interested in $\alpha^{-1}(0_W) = \{v \in V \mid 
\alpha(v) = 0_W\}$. This set is called the kernel of $\alpha$ and is denoted by $\ker(\alpha)$. Then $\ker(\alpha)$ 
is never empty, since it always contains $0_V$. If $U$ is a nonempty subset of $W$, set 
$\alpha^{-1}(U) = \bigcup \{\alpha^{-1}(u) \mid u \in U\}$. It is easy to verify that $\alpha^{-1}(U)$ is a subspace of $V$ 
whenever $U$ is a subspace of $W$. 


Example Let F be a field and let a € Hom(F 3, F*) be the linear transformation 


a—b 
a 0 a 
defined bya: | b | |> i . Then ker(œ) = aļ|aeF 
c 0 
c 


Proposition 6.4 Let $V$ and $W$ be vector spaces over a field $F$ and let 
$\alpha \in \mathrm{Hom}(V, W)$. Then $\ker(\alpha)$ is a subspace of $V$, which is trivial if and only 
if $\alpha$ is monic. 


Proof Let $v_1, v_2 \in \ker(\alpha)$ and let $a \in F$. Then $\alpha(v_1 + v_2) = \alpha(v_1) + \alpha(v_2) = 0_W + 
0_W = 0_W$, and so $v_1 + v_2 \in \ker(\alpha)$. Similarly, $\alpha(av_1) = a\alpha(v_1) = a0_W = 0_W$ and 
so $av_1 \in \ker(\alpha)$. This proves that $\ker(\alpha)$ is a subspace of $V$. 

If $\alpha$ is monic then $\alpha^{-1}(w)$ can have at most one element for each $w \in W$, and 
so, in particular, $\ker(\alpha) = \{0_V\}$. Conversely, suppose that $\ker(\alpha)$ is trivial and that 
there exist elements $v_1 \ne v_2$ of $V$ satisfying $\alpha(v_1) = \alpha(v_2)$. Then $\alpha(v_1 - v_2) = 
\alpha(v_1) - \alpha(v_2) = 0_W$ and so $v_1 - v_2 \in \ker(\alpha)$. Thus $v_1 - v_2 = 0_V$ and so $v_1 = v_2$, 
which is a contradiction. Hence $\alpha$ must be monic. 


Let $V$ and $W$ be vector spaces over a field and let $\alpha : V \to W$ be a linear 
transformation. The image of $\alpha$ is the subset $\mathrm{im}(\alpha) = \{\alpha(v) \mid v \in V\}$ of $W$. This 
set is nonempty since $0_W = \alpha(0_V) \in \mathrm{im}(\alpha)$. Note that $w \in \mathrm{im}(\alpha)$ if and only if 
$\alpha^{-1}(w) \ne \emptyset$. If $U$ is a nonempty subset of $V$, we denote the subset $\{\alpha(u) \mid u \in U\}$ 
of $W$ by $\alpha(U)$. Thus $\alpha(V) = \mathrm{im}(\alpha)$. 


Proposition 6.5 Let $V$ and $W$ be vector spaces over a field $F$ and let 
$\alpha \in \mathrm{Hom}(V, W)$. Then $\mathrm{im}(\alpha)$ is a subspace of $W$, which is improper if and 
only if $\alpha$ is epic. 


Proof If $\alpha(v_1)$ and $\alpha(v_2)$ are in $\mathrm{im}(\alpha)$ and if $a \in F$, then $\alpha(v_1) + \alpha(v_2) = 
\alpha(v_1 + v_2) \in \mathrm{im}(\alpha)$ and similarly $a\alpha(v_1) = \alpha(av_1) \in \mathrm{im}(\alpha)$, proving that $\mathrm{im}(\alpha)$ is a 
subspace of $W$. The second part follows immediately from the definition of an epic 
function. 


A monic linear transformation between vector spaces over a field $F$ is called a 
monomorphism; an epic linear transformation between vector spaces is called an 
epimorphism. A bijective linear transformation between vector spaces is called an 
isomorphism. If both spaces are also $F$-algebras, then a bijective homomorphism 
of $F$-algebras is called an isomorphism of $F$-algebras. Similarly, a bijective 
homomorphism of unital $F$-algebras is an isomorphism of unital $F$-algebras. 


Example Let $F$ be a field and let $k$ and $n$ be positive integers. For each matrix 
$A = [a_{ij}] \in M_{k \times n}(F)$, we can define the transpose of $A$ to be the matrix 
$A^T \in M_{n \times k}(F)$ obtained from $A$ by interchanging its rows and columns. In other 
words, 

\[ A^T = \begin{bmatrix} a_{11} & \cdots & a_{k1} \\ \vdots & \ddots & \vdots \\ a_{1n} & \cdots & a_{kn} \end{bmatrix}. \]

It is easy to check that the function $A \mapsto A^T$ is an 
isomorphism from $M_{k \times n}(F)$ to $M_{n \times k}(F)$. 


Example Let $K$ and $L$ be $F$-algebras. It is possible for a linear transformation 
$\alpha : K \to L$ to be an isomorphism of vector spaces without being an isomorphism 
of $F$-algebras. This is the case, for example, with the linear transformation 
$\alpha : \mathbb{Q}(\sqrt{2}) \to \mathbb{Q}(\sqrt{5})$ given by $\alpha : a + b\sqrt{2} \mapsto a + b\sqrt{5}$. 


Example Let $V$ be a vector space over a field $F$. Any linear transformation 
$\alpha : V \to F$ other than the 0-function is an epimorphism. Indeed, if $\alpha$ is a nonzero 
linear transformation and if $v_0 \in V$ satisfies the condition that $\alpha(v_0) = c \ne 0$, then 
for any $a \in F$ we have $a = (ac^{-1})c = (ac^{-1})\alpha(v_0) = \alpha((ac^{-1})v_0) \in \mathrm{im}(\alpha)$. 


Example Let $F$ be a field and let $\alpha : F^{(\mathbb{N})} \to F[X]$ be the function defined by 
$\alpha : f \mapsto \sum_{i=0}^{\infty} f(i)X^i$, which is well-defined since only finitely-many of the $f(i)$ 
are nonzero. This is easily checked to be an isomorphism of vector spaces. 


We have already seen that if $D$ is a basis of a vector space $V$ over a field $F$ then 
there exists a bijective function $\theta : F^{(D)} \to V$, and it is easy to verify that this is in 
fact an isomorphism of vector spaces. This leads us to the very important observation 
that for any nontrivial vector space $V$ over a field $F$ there exists a nonempty set 
$\Omega$ and an isomorphism $F^{(\Omega)} \to V$. 

Let $V$ and $W$ be vector spaces over a field $F$ and let $B$ be a basis of $V$. Then we 
can define a function $\varphi : \mathrm{Hom}(V, W) \to W^B$ by restriction: $\varphi(\alpha) : u \mapsto \alpha(u)$ for all 
$u \in B$. It is straightforward to check that $\varphi$ is a linear transformation of vector spaces 
over $F$. Moreover, by Proposition 6.2, we see that any function $f \in W^B$ is of the 
form $\varphi(\alpha)$ for a unique element $\alpha$ of $\mathrm{Hom}(V, W)$. Therefore, $\varphi$ is an isomorphism. 
Let $V$ and $W$ be vector spaces over a field $F$. If $\alpha : V \to W$ is a linear 
transformation and $0_W \ne w \in W$ then $\alpha^{-1}(w)$ is not a subspace of $V$. However, the next 
result shows that, if it is nonempty, it is close to being a subspace. 


Proposition 6.6 Let $\alpha : V \to W$ be a linear transformation of vector spaces 
over a field $F$ and let $w \in \mathrm{im}(\alpha)$. For any $v_0 \in \alpha^{-1}(w)$ we have $\alpha^{-1}(w) = 
\{v + v_0 \mid v \in \ker(\alpha)\}$. 


Proof If $v \in \ker(\alpha)$ then $\alpha(v + v_0) = \alpha(v) + \alpha(v_0) = 0_W + w = w$ and so $v + v_0 \in 
\alpha^{-1}(w)$. Conversely, if $v_1 \in \alpha^{-1}(w)$ then $v_1 = (v_1 - v_0) + v_0$, where $v_1 - v_0 \in 
\ker(\alpha)$ since $\alpha(v_1 - v_0) = \alpha(v_1) - \alpha(v_0) = w - w = 0_W$. 
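
Over a finite field the statement of Proposition 6.6 can be verified by listing everything. The sketch below assumes $F = GF(3)$ and uses a hypothetical linear map from $F^3$ to $F^2$ chosen only for illustration; it compares the full preimage of a vector $w$ with the kernel shifted by one particular preimage $v_0$. 

```python
from itertools import product

p = 3  # work over GF(3)

def alpha(v):
    """Hypothetical linear map GF(3)^3 -> GF(3)^2, chosen only for illustration."""
    a, b, c = v
    return ((a + b) % p, (2 * b + c) % p)

V = list(product(range(p), repeat=3))
kernel = [v for v in V if alpha(v) == (0, 0)]

v0 = (1, 2, 0)
w = alpha(v0)                               # an element of im(alpha)
preimage = {v for v in V if alpha(v) == w}  # alpha^{-1}(w)

# The kernel shifted by v0: {v + v0 | v in ker(alpha)}.
shifted = {tuple((x + y) % p for x, y in zip(k, v0)) for k in kernel}
```

The two sets coincide, and replacing `v0` by any other element of `preimage` gives the same shifted set. 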


Note that if $w \ne 0_W$ then $\alpha^{-1}(w)$ is not a subspace of $V$ but rather the result 
of “shifting” a subspace by adding a fixed nonzero vector to each of its elements. 
Such a subset of a vector space is called an affine subset, or linear variety, of the 
vector space. Let $V$ and $W$ be vector spaces over a field $F$. An affine transformation 
$\zeta : V \to W$ is a function of the form $v \mapsto \alpha(v) + y$, for some fixed $\alpha \in \mathrm{Hom}(V, W)$ 
and $y \in W$. It is clear that the sum of two affine transformations is again an affine 
transformation, as is the product of an affine transformation by a scalar, so that 
the set $\mathrm{Aff}(V, W)$ of all affine transformations from $V$ to $W$ is also a subspace 
of $W^V$ which in turn contains $\mathrm{Hom}(V, W)$ as a subspace. Indeed, $\mathrm{Aff}(V, W) = 
F(\mathrm{Hom}(V, W) \cup K)$, where $K$ is the set of all constant functions from $V$ to $W$. 

Moreover, if $\zeta : V \to W$ is the affine transformation defined by $v \mapsto \alpha(v) + y$ 
and if $w \in W$, then $\zeta^{-1}(w) = \alpha^{-1}(w - y)$ and so is an affine subset of $V$. 

Analysis of computational procedures in linear algebra often hinges on the 
fact that when we think we are computing the effect of some linear transformation 
$\alpha \in \mathrm{Hom}(V, W)$, we are in fact computing that of an affine transformation 
$v \mapsto \alpha(v) + y$ where $y$ is a vector arising from computational or random errors 
which, hopefully, is “very small” (in some sense) relative to $\alpha(v)$. Similarly, in linear 
models in statistics one must allow for such an affine transformation, where $y$ is 
a random error vector, assumed to have expectation 0. 


Example Let $V = C(0, 1)$ and let $W$ be the subspace of $V$ composed of all 
differentiable functions having a continuous derivative. Let $\delta : W \to V$ be the linear 
transformation which assigns to each function $f \in W$ its derivative. Then $\ker(\delta)$ 
consists of all constant functions. If $g \in \mathrm{im}(\delta)$ then $g = \delta(f)$, where $f$ is the 
function $f : x \mapsto \int_0^x g(t)\,dt$. Thus $\delta^{-1}(g)$ consists of all functions of the form 
$f : x \mapsto \int_0^x g(t)\,dt + c$, where $c \in \mathbb{R}$. 


Proposition 6.7 If $\alpha : V \to W$ is an isomorphism of vector spaces over a 
field $F$ then there exists an isomorphism $\beta : W \to V$ satisfying $\beta\alpha(v) = v$ 
and $\alpha\beta(w) = w$ for all $v \in V$ and all $w \in W$. 


Proof Define the function $\beta$ by $\beta(w) = v$ if and only if $w = \alpha(v)$. This function is 
well-defined since every element $w$ is of the form $\alpha(v)$ for a unique element $v \in V$. 
It is easy to check that the function $\beta$ is an isomorphism which satisfies the stated 
conditions. 


The function $\beta$ defined in Proposition 6.7 is denoted by $\alpha^{-1}$. 


Let $V$ and $W$ be vector spaces over a field $F$. If there exists an isomorphism from 
$V$ to $W$, we say that $V$ and $W$ are isomorphic and write $V \cong W$. It is easy to see 
that if $V$, $W$, and $Y$ are vector spaces over $F$ then: 
(1) $V \cong V$; 
(2) If $V \cong W$ then $W \cong V$; 
(3) If $V \cong W$ and $W \cong Y$ then $V \cong Y$. 
It is also clear that if $\alpha : V \to W$ is an isomorphism between vector spaces over 
$F$ and if $B$ is a basis of $V$ then $\{\alpha(u) \mid u \in B\}$ is a basis of $W$. As an immediate 
consequence of this, we see that if $V \cong W$ then the dimensions of $V$ and $W$ are the 
same. The converse is true if $V$ and $W$ are finitely generated, as we shall now see. 


Proposition 6.8 Let $V$ and $W$ be vector spaces over a field $F$ having bases 
$B$ and $D$, respectively, and assume that there exists a bijective function 
$f : B \to D$. Then $V \cong W$. 


Proof By Proposition 6.2, we know that there exists a linear transformation 
$\alpha \in \mathrm{Hom}(V, W)$ satisfying the condition $\alpha(v) = f(v)$ for all $v \in B$. This linear 
transformation is epic since $\mathrm{im}(\alpha)$ contains a basis of $W$. If $v' = \sum_{v \in B} a_v v$ 
(where only finitely-many of the coefficients $a_v$ are nonzero) belongs to $\ker(\alpha)$ then 
$0_W = \alpha(v') = \alpha\left(\sum_{v \in B} a_v v\right) = \sum_{v \in B} a_v\alpha(v) = \sum_{v \in B} a_v f(v)$ and so $a_v = 0$ for all 
$v \in B$, since $D$ is linearly independent. Therefore, $\ker(\alpha)$ is trivial, and this shows 
that $\alpha$ is monic and hence an isomorphism. 


In particular, if $V$ and $W$ are vector spaces of the same finite dimension $n$ over a 
field $F$, then $V \cong W$. 


Proposition 6.9 If $V$ and $W$ are vector spaces finitely generated over a field 
$F$, then 
(1) There exists a monomorphism from $V$ to $W$ if and only if $\dim(V) \le \dim(W)$; 
(2) There exists an epimorphism from $V$ to $W$ if and only if $\dim(V) \ge \dim(W)$. 


Proof (1) If there exists a monomorphism $\alpha$ from $V$ to $W$ then $V \cong \mathrm{im}(\alpha)$ and 
so $\dim(W) \ge \dim(\mathrm{im}(\alpha)) = \dim(V)$. Conversely, assume that $\dim(V) \le \dim(W)$. 
Then there exists a basis $B = \{v_1, \ldots, v_n\}$ of $V$ and there exists a basis $D = 
\{w_1, \ldots, w_t\}$ of $W$, where $n \le t$. The function from $B$ to $W$ given by $v_i \mapsto w_i$ 
for all $1 \le i \le n$ can be extended to a linear transformation $\alpha : V \to W$, which is 
monic and so is a monomorphism. 

(2) If there exists an epimorphism $\alpha$ from $V$ to $W$ and if $\{v_1, \ldots, v_n\}$ is a basis 
of $V$, then $\{\alpha(v_i) \mid 1 \le i \le n\}$ is a generating set of $W$ and so the dimension of 
$W$ is at most $n = \dim(V)$. Conversely, if $n = \dim(V) \ge \dim(W) = t$, pick a basis 
$\{w_1, \ldots, w_t\}$ of $W$ and a basis $B = \{v_1, \ldots, v_n\}$ of $V$. Define a function $f : B \to W$ 
by 

\[ f : v_i \mapsto \begin{cases} w_i & \text{for } 1 \le i \le t, \\ w_t & \text{for } t < i \le n. \end{cases} \]

From Proposition 6.2, it follows that there exists a linear transformation $\alpha : V \to W$ 
satisfying $\alpha(v_i) = f(v_i)$ for all $1 \le i \le n$, and this is the desired epimorphism. 


Proposition 6.10 Let $V$ and $W$ be vector spaces over a field $F$, where $V$ is 
finitely generated. Then $\dim(V) = \dim(\mathrm{im}(\alpha)) + \dim(\ker(\alpha))$ for any linear 
transformation $\alpha \in \mathrm{Hom}(V, W)$. 


Proof Let $\alpha \in \mathrm{Hom}(V, W)$. Set $V_1 = \ker(\alpha)$ and let $V_2$ be a complement of $V_1$ 
in $V$. By Proposition 5.16, we see that $\dim(V) = \dim(V_1) + \dim(V_2)$ and so it 
suffices for us to show that $V_2 \cong \mathrm{im}(\alpha)$. Let $\alpha_2$ be the restriction of $\alpha$ to $V_2$. 
Then $\alpha_2 \in \mathrm{Hom}(V_2, \mathrm{im}(\alpha))$. If $v_2 \in \ker(\alpha_2)$ then $v_2 \in V_2 \cap V_1 = \{0_V\}$. Thus $\alpha_2$ 
is a monomorphism. If $w \in \mathrm{im}(\alpha)$ then there exists an element $v$ of $V$ satisfying 
$\alpha(v) = w$. Moreover, $v = v_1 + v_2$ for some $v_1 \in V_1$ and $v_2 \in V_2$ so $w = \alpha(v) = 
\alpha(v_1) + \alpha(v_2) = 0_W + \alpha(v_2) = \alpha(v_2) = \alpha_2(v_2)$. Therefore, $\mathrm{im}(\alpha_2) = \mathrm{im}(\alpha)$, showing 
that $\alpha_2$ is also an epimorphism and hence the desired isomorphism. 
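
For a transformation given by a matrix over $\mathbb{Q}$, both dimensions in Proposition 6.10 can be read off from a reduced row echelon form: the number of pivot columns is $\dim(\mathrm{im}(\alpha))$ and each non-pivot column yields one kernel basis vector. The following is a minimal sketch under that assumption; the function names `rref` and `kernel_basis` and the sample matrix are hypothetical. 

```python
from fractions import Fraction

def rref(A):
    """Reduced row echelon form over Q; returns (matrix, list of pivot columns)."""
    M = [[Fraction(x) for x in row] for row in A]
    pivots, r = [], 0
    for c in range(len(M[0])):
        # Find a row at or below r with a nonzero entry in column c.
        pr = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pr is None:
            continue
        M[r], M[pr] = M[pr], M[r]
        pivot = M[r][c]
        M[r] = [x / pivot for x in M[r]]
        for i in range(len(M)):
            if i != r:
                factor = M[i][c]
                M[i] = [x - factor * y for x, y in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    return M, pivots

def kernel_basis(A):
    """Basis of {v : A v = 0}, one vector per non-pivot (free) column."""
    R, pivots = rref(A)
    n = len(A[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[free] = Fraction(1)
        for row, pc in zip(R, pivots):  # the first len(pivots) rows are the pivot rows
            v[pc] = -row[free]
        basis.append(v)
    return basis

# Hypothetical transformation Q^4 -> Q^3 (the third row is the sum of the first two).
A = [[1, 2, 0, 1],
     [2, 4, 1, 3],
     [3, 6, 1, 4]]
rank = len(rref(A)[1])
nullity = len(kernel_basis(A))
```

For this matrix the rank and the nullity are both 2, and their sum is the dimension 4 of the domain, as the proposition requires. 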


Let $V$ and $W$ be vector spaces over a field $F$. If $\alpha \in \mathrm{Hom}(V, W)$ then we define 
the rank $\mathrm{rk}(\alpha)$ of $\alpha$ to be $\dim(\mathrm{im}(\alpha))$ and define the nullity $\mathrm{null}(\alpha)$ of $\alpha$ to be 
$\dim(\ker(\alpha))$. Thus, Proposition 6.10 says that if $V$ has finite dimension $n$ then both 
the rank and nullity of $\alpha$ are finite and their sum is $n$. The converse is also clearly 
true: if the rank and nullity of $\alpha$ are both finite, then the dimension of $V$ is finite. Let 
us give bounds on the rank and nullity of compositions of linear transformations. 


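Proposition 6.11 (Sylvester’s Theorem) Let $V$, $W$, and $Y$ be vector spaces 
finitely-generated over a field $F$ and let $\alpha : V \to W$ and $\beta : W \to Y$ be linear 
transformations. Then 
(1) $\mathrm{null}(\beta\alpha) \le \mathrm{null}(\alpha) + \mathrm{null}(\beta)$; 
(2) $\mathrm{rk}(\alpha) + \mathrm{rk}(\beta) - \dim(W) \le \mathrm{rk}(\beta\alpha) \le \min\{\mathrm{rk}(\alpha), \mathrm{rk}(\beta)\}$. 


Proof (1) Let $\beta_1$ be the restriction of $\beta$ to $\mathrm{im}(\alpha)$. Then $\ker(\beta_1)$ is a subspace of 
$\ker(\beta)$. By Proposition 6.10, we have 

\begin{align*}
\mathrm{null}(\beta\alpha) = \dim(V) - \mathrm{rk}(\beta\alpha) &= [\dim(V) - \mathrm{rk}(\alpha)] + [\mathrm{rk}(\alpha) - \mathrm{rk}(\beta\alpha)] \\
&= \mathrm{null}(\alpha) + \mathrm{null}(\beta_1) \le \mathrm{null}(\alpha) + \mathrm{null}(\beta).
\end{align*}

(2) Clearly, $\mathrm{im}(\beta\alpha)$ is a subspace of $\mathrm{im}(\beta)$ and so its dimension is no greater than 
that of $\mathrm{im}(\beta)$. Moreover, $\mathrm{im}(\beta\alpha) = \mathrm{im}(\beta_1)$ and so $\mathrm{rk}(\beta\alpha) \le \mathrm{rk}(\alpha)$. Thus $\mathrm{rk}(\beta\alpha) \le 
\min\{\mathrm{rk}(\alpha), \mathrm{rk}(\beta)\}$. Moreover, from (1) we see that 

\begin{align*}
\dim(V) - \mathrm{null}(\beta\alpha) &\ge \dim(V) - \mathrm{null}(\alpha) + \dim(W) - \mathrm{null}(\beta) - \dim(W) \\
&= \mathrm{rk}(\alpha) + \mathrm{rk}(\beta) - \dim(W),
\end{align*}

and this proves that $\mathrm{rk}(\alpha) + \mathrm{rk}(\beta) - \dim(W) \le \mathrm{rk}(\beta\alpha)$. 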
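
The bounds of Sylvester's Theorem can be observed on concrete matrices. The sketch below assumes all three spaces are of the form $\mathbb{Q}^m$, so that the transformations are matrices and composition is matrix multiplication; the function names `rank` and `matmul` and the two matrices are hypothetical, chosen so that $\beta$ vanishes on $\mathrm{im}(\alpha)$ and the lower bound in (2) and the bound in (1) are both attained. 

```python
from fractions import Fraction

def rank(A):
    """Rank of a matrix over Q by Gaussian elimination."""
    M = [[Fraction(x) for x in row] for row in A]
    r = 0
    for c in range(len(M[0])):
        pr = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pr is None:
            continue
        M[r], M[pr] = M[pr], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][c] / M[r][c]
            M[i] = [x - factor * y for x, y in zip(M[i], M[r])]
        r += 1
    return r

def matmul(B, A):
    """Matrix product B*A, i.e. the matrix of the composite 'A first, then B'."""
    return [[sum(b * a for b, a in zip(row, col)) for col in zip(*A)] for row in B]

# Hypothetical alpha: Q^4 -> Q^3 and beta: Q^3 -> Q^2.
A = [[1, 0, 2, 0],
     [0, 1, 1, 0],
     [1, 1, 3, 0]]
B = [[1, 1, -1],
     [2, 2, -2]]
BA = matmul(B, A)

dim_V, dim_W = 4, 3
rk_a, rk_b, rk_ba = rank(A), rank(B), rank(BA)
null_a, null_b, null_ba = dim_V - rk_a, dim_W - rk_b, dim_V - rk_ba
```

Here `rk_a` is 2, `rk_b` is 1, and `rk_ba` is 0, so that $\mathrm{rk}(\alpha) + \mathrm{rk}(\beta) - \dim(W) = 0 = \mathrm{rk}(\beta\alpha)$ and $\mathrm{null}(\beta\alpha) = 4 = \mathrm{null}(\alpha) + \mathrm{null}(\beta)$. 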
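Exercise 233 

Which of the following statements are true for all vector spaces $V$ and $W$ over a 
field $F$ and all $\alpha \in \mathrm{Hom}(V, W)$? 

(1) $\alpha(A \cup B) = \alpha(A) \cup \alpha(B)$ for all nonempty subsets $A$ and $B$ of $V$; 

(2) $\alpha(A \cap B) = \alpha(A) \cap \alpha(B)$ for all nonempty subsets $A$ and $B$ of $V$; 

(3) $\alpha^{-1}(C \cup D) = \alpha^{-1}(C) \cup \alpha^{-1}(D)$ for all nonempty subsets $C$ and $D$ of $W$; 

(4) $\alpha^{-1}(C \cap D) = \alpha^{-1}(C) \cap \alpha^{-1}(D)$ for all nonempty subsets $C$ and $D$ of $W$. 


Exercise 234 

Let $\alpha : \mathbb{R}^3 \to \mathbb{R}^3$ be a linear transformation satisfying 

\[ \alpha\left( \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} \right) = \begin{bmatrix} -1 \\ 3 \\ 4 \end{bmatrix}, \quad \alpha\left( \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} \right) = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \quad \text{and} \quad \alpha\left( \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix} \right) = \begin{bmatrix} 3 \\ 1 \\ 4 \end{bmatrix}. \]

What is $\alpha\left( \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \right)$? 
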
Exercise 235 
1 1 
Let a : R? — R? be a linear transformation satisfying œ 1 = ; 
0 -1 


1 0 3 
a 0 =]1],anda —1 = | 3 |. Finda vector v € R? for which 
3 
1 
0 
0 
Exercise 236 
Let F be a field and let V be the subspace of F[X] consisting of all polynomials 
of degree at most 2. Let a: V > F[X] be a linear transformation satisfying 
a(1) = X,a(X +1) = X? + X3, and a(X2 + X + 1) = X4 — X? +1. What is 
a(X2 — X)? 


Exercise 237 
For each d € R, let wg : R? > R? be the function defined by 


[a a+b+d*+1 
Ag: b > a 7 


Is there a number d having the property that ag is a linear transformation? What 
if we consider ag as a function from GF Gy to itself? 


Exercise 238 
For each d € R, let ay : R? > R? be the function defined by 


alilo 5da — db 
2b 8d? — 8d — 6 |" 
Is there a number d having the property that wg is a linear transformation? 


Exercise 239 

Let $V$ and $W$ be vector spaces over $\mathbb{Q}$ and let $\alpha : V \to W$ be a function satisfying 
$\alpha(v + v') = \alpha(v) + \alpha(v')$ for all $v, v' \in V$. Is $\alpha$ necessarily a linear transformation? 


Exercise 240 
Let $\alpha : \mathbb{R} \to \mathbb{R}$ be a continuous function which satisfies $\alpha(a + b) = \alpha(a) + \alpha(b)$ 
for all $a, b \in \mathbb{R}$. Show that $\alpha$ is a linear transformation. 


Exercise 241 

Let $W$ and $W'$ be subspaces of a vector space $V$ over a field $F$ and assume 
that we have linear transformations $\alpha : W \to V$ and $\beta : W' \to V$ satisfying the 
condition that $\alpha(v) = \beta(v)$ for all $v \in W \cap W'$. Find a linear transformation 
$\theta : W + W' \to V$, the restriction of which to $W$ equals $\alpha$ and the restriction of 
which to $W'$ equals $\beta$, or show why no such linear transformation exists. 


Exercise 242 
Let $F = GF(3)$ and let $\theta \in F^F$ be the function defined by $\theta(0) = 0$, $\theta(1) = 2$, 
and $\theta(2) = 1$. Let $n$ be a positive integer and let $\alpha : F^n \to F^n$ be the function 
defined by 
\[ \alpha : \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix} \mapsto \begin{bmatrix} \theta(a_1) \\ \vdots \\ \theta(a_n) \end{bmatrix}. \]
Is $\alpha$ a linear transformation? 
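Exercise 243 
Does there exist a linear transformation $\alpha : \mathbb{Q}^4 \to \mathbb{Q}[X]$ satisfying 

\[ \alpha\left( \begin{bmatrix} 1 \\ 2 \\ 0 \\ -1 \end{bmatrix} \right) = 2, \quad \alpha\left( \begin{bmatrix} -1 \\ 1 \\ 1 \\ 1 \end{bmatrix} \right) = X, \quad \text{and} \quad \alpha\left( \begin{bmatrix} -1 \\ 4 \\ 2 \\ 1 \end{bmatrix} \right) = X + 1? \]


Exercise 244 
Let $B$ be a Hamel basis for $\mathbb{R}$ as a vector space over $\mathbb{Q}$ and let $1 \ne a \in \mathbb{R}$. Show 
that there exists an element $y \in B$ satisfying $ay \notin B$. 


Exercise 245 
For which nonnegative integers $h$ is the function $\alpha$ from $GF(3)^3$ to itself defined 
by 
\[ \alpha : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} a^h \\ b^h \\ c^h \end{bmatrix} \]
a linear transformation? 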


Exercise 246 
For any field $F$, let $\theta : F \to F$ be the function defined by 

\[ \theta : a \mapsto \begin{cases} 0 & \text{if } a = 0, \\ a^{-1} & \text{otherwise.} \end{cases} \]

This is clearly a linear transformation when $F = GF(2)$. Does there exist a field 
other than $GF(2)$ for which $\theta$ is a linear transformation? 


Exercise 247 

Let $V = F^\infty$ and let $\alpha : V \to V$ be the function that assigns to each sequence 
$[a_1, a_2, \ldots] \in V$ its sequence of partial sums, namely $[a_1, a_2, \ldots] \mapsto 
\left[a_1, \sum_{i=1}^{2} a_i, \sum_{i=1}^{3} a_i, \ldots\right]$. Is $\alpha$ a linear transformation? 


Exercise 248 
Let $Y = \mathbb{R}^{\mathbb{R}} \times \mathbb{R}$. Is the function $\alpha : Y \to \mathbb{R}$ defined by $\alpha : (f, a) \mapsto f(a)$ a linear 
transformation? 


Exercise 249 
Let $F$ be a field and let $b$ and $c$ be nonzero elements of $F$. Let $\alpha : F^\infty \to F^\infty$ be 
the linear transformation defined by 

\[ \alpha : [a_1, a_2, \ldots] \mapsto [a_3 + ba_2 + ca_1, a_4 + ba_3 + ca_2, \ldots]. \]

Let $y \in \ker(\alpha)$ be a vector satisfying the condition that two successive entries in 
$y$ equal 0. Show that $y = [0, 0, \ldots]$. 
Exercise 250 
Consider the field $F = \mathbb{Q}(\sqrt{2})$ as a $\mathbb{Q}$-algebra. Show that the only homomorphisms 
of $\mathbb{Q}$-algebras from $F$ to itself are the identity function and the function 
$a + b\sqrt{2} \mapsto a - b\sqrt{2}$. 


Exercise 251 

Let $V$, $W$, and $Y$ be vector spaces finitely-generated over a field $F$ and let 
$\alpha : V \to W$ be a linear transformation. Show that the set of all linear transformations 
$\beta : W \to Y$ satisfying the condition that $\beta\alpha$ is the 0-transformation is a 
subspace of $\mathrm{Hom}(W, Y)$, and calculate its dimension. 


Exercise 252 

Let $V$ and $W$ be vector spaces over a field $F$ and let $V'$ be a proper subspace 
of $V$. Are $\{\alpha \in \mathrm{Hom}(V, W) \mid \ker(\alpha) \subseteq V'\}$ and $\{\alpha \in \mathrm{Hom}(V, W) \mid \ker(\alpha) \supseteq V'\}$ 
subspaces of $\mathrm{Hom}(V, W)$? 


Exercise 253 

Let $V$ and $W$ be vector spaces over a field $F$ and assume that there are subspaces 
$V_1$ and $V_2$ of $V$, both of positive dimension, satisfying $V = V_1 \oplus V_2$. For 
$i = 1, 2$, let $U_i = \{\alpha \in \mathrm{Hom}(V, W) \mid \ker(\alpha) \supseteq V_i\}$. Show that $\{U_1, U_2\}$ is an independent 
set of subspaces of $\mathrm{Hom}(V, W)$. Is it necessarily true that $\mathrm{Hom}(V, W) = 
U_1 \oplus U_2$? 


Exercise 254 
Let F be a field, and let a : M2x2(F) > Mnxn(F) be a homomorphism of 


F-algebras for some n > 1. Show that a (i al) Al, 


Exercise 255 

Let $V$ and $W$ be vector spaces over a field $F$ and let $\alpha, \beta : V \to W$ be linear 
transformations satisfying the condition that for each $v \in V$ there exists a scalar 
$c_v \in F$ (depending on $v$) satisfying $\beta(v) = c_v\alpha(v)$. Show that there exists a scalar 
$c$ satisfying $\beta = c\alpha$. 


Exercise 256 


Let V and W be vector spaces over a field F. Define a function g : Hom(V, W) > 


Hom(V x W, V x W) by setting g(a) : | > me 


mation of vector spaces over F? Is it a monomorphism? 


| Is ọ a linear transfor- 
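Exercise 257 
Find the kernel of the linear transformation $\alpha : \mathbb{R}^5 \to \mathbb{R}^3$ defined by 

\[ \alpha : \begin{bmatrix} a \\ b \\ c \\ d \\ e \end{bmatrix} \mapsto \begin{bmatrix} b + c - 2d + e \\ a + 2b + 3c - 4d \\ 2a + 2c - 2e \end{bmatrix}. \]


Exercise 258 
Let $\alpha : \mathbb{R}^3 \to \mathbb{R}^3$ be the linear transformation defined by 

\[ \alpha : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} 2a + 4b - c \\ 0 \\ 3c + 2b - a \end{bmatrix}. \]

Are $\mathrm{im}(\alpha)$ and $\ker(\alpha)$ disjoint? 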


Exercise 259 


2 1 1 

—1 (0) 1 

Let W be the subspace Q oblot lo 
1 1 1 


the linear transformation defined by setting a : | a+2b+c 


—a—2b—-—c 


| of Q* and let a: W > Q? be 
a 

a |: Fina 
c 

d 


basis for ker(q@). 


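Exercise 260 

Let $F = GF(3)$ and let $\alpha : F^3 \to F^3$ be the linear transformation defined by 
\[ \alpha : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} a + b \\ 2b + c \\ 0 \end{bmatrix}. \]
Find the kernel of $\alpha$. 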


Exercise 261 
Let a : R4 — R? be the linear transformation defined by 


7 2a+4b+c-d 
Qa: > 3a+b—2c 
a a+5c+4d 


Do there exist a, b, d € Z such that E€ ker(a)? 


QxAS8 
Exercise 262 

Let $V$ and $W$ be vector spaces over a field $F$. Let $\alpha \in \mathrm{Hom}(V, W)$ and 
$\beta \in \mathrm{Hom}(W, V)$ satisfy the condition that $\alpha\beta\alpha = \alpha$. If $w \in \mathrm{im}(\alpha)$, show that 
$\alpha^{-1}(w) = \{\beta(w) + v - \beta\alpha(v) \mid v \in V\}$. 


Exercise 263 

Let $V$, $W$, and $Y$ be vector spaces over a field $F$ and let $\alpha \in \mathrm{Hom}(V, W)$ and 
$\beta \in \mathrm{Hom}(W, Y)$ satisfy the condition that $\mathrm{im}(\alpha)$ has a finitely-generated complement 
in $W$ and $\mathrm{im}(\beta)$ has a finitely-generated complement in $Y$. Does $\mathrm{im}(\beta\alpha)$ 
necessarily have a finitely-generated complement in $Y$? 


Exercise 264 
Let $\alpha : M_{3 \times 3}(\mathbb{R}) \to \mathbb{R}$ be defined by $\alpha : [a_{ij}] \mapsto \sum_{i=1}^{3} \sum_{j=1}^{3} a_{ij}$. Show that $\alpha$ 
is a linear transformation and find a basis for $\ker(\alpha)$. 


Exercise 265 

Let $F = GF(2)$ and let $n > 2$ be an integer. Let $W$ be the set of all vectors 
$\begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix}$ in $F^n$ having an even number of nonzero entries. Show that $W$ is a subspace of 
$F^n$ by showing that it is the kernel of some linear transformation. 


Exercise 266 

Let $A$ and $B$ be nonempty sets. Let $V$ be the collection of all subsets of $A$ and 
let $W$ be the collection of all subsets of $B$, both of which are vector spaces 
over $GF(2)$. Any function $f : A \to B$ defines a function $\alpha_f : W \to V$ by setting 
$\alpha_f : D \mapsto \{a \in A \mid f(a) \in D\}$. Show that each such function $\alpha_f$ is a linear 
transformation, and find its kernel. 


Exercise 267 
Let $V$ be a vector space over a field $F$ and let $\alpha : V^3 \to V$ be the function defined 
by 
\[ \alpha : \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} \mapsto v_1 + v_2 + v_3. \]
Show that $\alpha$ is a linear transformation and find its kernel. 


Exercise 268 

Let $n$ be a positive integer and let $V$ be the subspace of $\mathbb{R}[X]$ composed of all 
polynomials of degree at most $n$. Let $\alpha : V \to V$ be the linear transformation 
given by $\alpha : p(X) \mapsto p(X + 1) - p(X)$. Find $\ker(\alpha)$ and $\mathrm{im}(\alpha)$. 
Exercise 269 
Let $\alpha : \mathbb{R}^3 \to \mathbb{R}^3$ be the linear transformation given by 

\[ \alpha : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} a + b + c \\ -a - c \\ b \end{bmatrix}. \]

Find $\ker(\alpha)$ and $\mathrm{im}(\alpha)$. 


Exercise 270 
Find the kernel of the linear transformation $\alpha : \mathbb{Q}[X] \to \mathbb{R}$ defined by $\alpha : 
p(X) \mapsto p(\sqrt{3})$. 


Exercise 271 
Let $V = C(0, 1)$. For each positive integer $n$, we define the $n$th Bernstein function 
$\beta_n : V \to \mathbb{R}[X]$ by 

\[ \beta_n : f \mapsto \sum_{k=0}^{n} \frac{n!}{k!(n-k)!} f\left(\frac{k}{n}\right) X^k (1 - X)^{n-k}. \]

Show that each $\beta_n$ is a linear transformation and find $\bigcap_{n=1}^{\infty} \ker(\beta_n)$. (Note: the 
Bernstein functions are used in building polynomial approximations to continuous 
functions.) 


With kind permission of the Archives of the Mathematisches Forschungsinstitut Ober- 
wolfach. 


Sergei Natanovich Bernstein was a twentieth-century Russian math- 
ematician who worked mostly in probability theory. 


Exercise 272 
Let V and W be nontrivial vector spaces over a field F. Show that W = 
>“ {im(@) |œ € Hom(V, W)}. 


Exercise 273 

Let W be the subspace of RÈ consisting of all twice-differentiable functions and 
let æ : W —> RË be the linear transformation a: f +> f”. Find w~!(fo), where 
fo € R® is defined by fo: xe x41. 


Exercise 274 

Let $W$ be the subspace of $\mathbb{R}^{\mathbb{R}}$ consisting of all differentiable functions and let 
$\alpha : W \to \mathbb{R}^{\mathbb{R}}$ be the function defined by $\alpha(f) : x \mapsto f'(x) + \cos(x) f(x)$. Show 
that $\alpha$ is a linear transformation and find its kernel. 


Exercise 275 

Let $n$ be a positive integer and let $V$ be a vector space over $\mathbb{C}$. Does there exist 
a linear transformation $\alpha : V \to \mathbb{C}^n$ other than the 0-function satisfying the 
condition that $\mathrm{im}(\alpha) \subseteq \mathbb{R}^n$? 


Exercise 276 

Let V and W be vector spaces over a field F and let a: V —> W be a linear 
transformation other than the 0-function. Find a linear transformation 6 : V > W 
satisfying im(@) = im(6) ~im(a@ + £). 


Exercise 277 

Let $V$ be a finite-dimensional vector space over a field $F$ and let $\alpha, \beta \in 
\mathrm{Hom}(V, V)$ be linear transformations satisfying $\mathrm{im}(\alpha) + \mathrm{im}(\beta) = V = \ker(\alpha) + 
\ker(\beta)$. Show that $\mathrm{im}(\alpha) \cap \mathrm{im}(\beta) = \{0_V\} = \ker(\alpha) \cap \ker(\beta)$. 


Exercise 278 

Let $V$, $W$, and $Y$ be vector spaces over a field $F$ and let $\alpha \in \mathrm{Hom}(V, W)$ and 
$\beta \in \mathrm{Hom}(W, Y)$ satisfy the condition that $\ker(\alpha)$ and $\ker(\beta)$ are both finitely 
generated. Is $\ker(\beta\alpha)$ necessarily finitely generated? 


Exercise 279 
Find a linear transformation a : Q? > Q* satisfying 


0.5 2 

ed 1 
im(a) = Q 3 l> A 
o| |—4 


Exercise 280 
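Let $F = GF(2)$ and let $\alpha \in \mathrm{Hom}(F^7, F^3)$ be given by

$$\alpha : \begin{bmatrix} a_1 \\ \vdots \\ a_7 \end{bmatrix} \mapsto \begin{bmatrix} a_4 + a_5 + a_6 + a_7 \\ a_2 + a_3 + a_6 + a_7 \\ a_1 + a_3 + a_5 + a_7 \end{bmatrix}.$$

If v is a nonzero element of $\ker(\alpha)$, show that at least three entries in v are equal
to 1.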


Exercise 281 

Let V and W be vector spaces finitely generated over a field F and let
$\alpha \in \mathrm{Hom}(V, W)$. If Y is a subspace of W, is it true that
$\dim(\alpha^{-1}(Y)) \ge \dim(V) - \dim(W) + \dim(Y)$?


Exercise 282 
Let V be a vector space over a field F and let $Y = V^{\infty}$. Let W be the subspace
of Y consisting of all those sequences $[v_1, v_2, \ldots]$ in which $v_i = 0_V$ for all odd i
and let W' be the subspace of Y consisting of all those sequences in which $v_i = 0_V$
for all even i. Find a linear transformation from Y to itself, the kernel of which
equals W and the image of which equals W'.


Exercise 283 
Let W be the subspace of $\mathbb{R}^6$ composed of all vectors $\begin{bmatrix} a_1 \\ \vdots \\ a_6 \end{bmatrix}$ satisfying
$\sum_{i=1}^{6} a_i = 0$. Does there exist a monomorphism from W to $\mathbb{R}^4$?


Exercise 284 

Let n be a positive integer and let $\alpha : \mathbb{Q}^n \to \mathbb{Q}^n$ be a linear transformation which
is not a monomorphism. Does there necessarily exist a nonzero element of $\ker(\alpha)$
all the entries of which are integers?


Exercise 285 
Let n be a positive integer and let W be the subspace of $\mathbb{C}[X]$ consisting of all
polynomials of degree less than n. Let $a_1, \ldots, a_n$ be distinct complex numbers
and let $\alpha : W \to \mathbb{C}^n$ be the function defined by

$$\alpha : p(X) \mapsto \begin{bmatrix} p(a_1) \\ \vdots \\ p(a_n) \end{bmatrix}.$$

Is $\alpha$ a monomorphism? Is it an isomorphism?


Exercise 286 

Let V be a vector space over a field F and let $\alpha : V \to V$ be a linear transformation
satisfying the condition that $\alpha^2 = a\alpha + b\sigma_1$, where a and b are nonzero
scalars. Show that $\alpha$ is a monomorphism.


Exercise 287 

Let p be a prime integer and let F be a field of characteristic p. Let (K, e) be an 
associative and commutative unital F -algebra and let æ : K — K be the function 
defined by a: vt» vP”. Show that œ is an isomorphism of unital F'-algebras. 


Exercise 288 
Let F be a field and let K and K’ be fields containing F. Show that every homo- 
morphism of F-algebras K —> K’ is a homomorphism of unital F'-algebras. 


Exercise 289 
Let F = GF(7). How many distinct monomorphisms can one define from F? 
to F4? 


Exercise 290 
Let V and W be vector spaces over a field F and let $\alpha, \beta \in \mathrm{Hom}(V, W)$ be
monomorphisms. Is $\alpha + \beta$ necessarily a monomorphism?


Exercise 291

Let F be a field and let F' be a field containing F. Let $(K, \bullet)$ be an F-algebra
and let $\alpha : F' \to K$ be a nontrivial homomorphism of F-algebras. Show that $\alpha$
is monic.


Exercise 292 

Let V and W be vector spaces over a field F and let $\alpha \in \mathrm{Hom}(V, W)$ be an
epimorphism. Show that there exists a linear transformation $\beta \in \mathrm{Hom}(W, V)$
satisfying the condition that $\alpha\beta$ is the identity function on W.


Exercise 293

Let V, W, and Y be vector spaces over a field F and let $\alpha \in \mathrm{Hom}(V, W)$ be
an epimorphism. Show that for each linear transformation $\beta \in \mathrm{Hom}(Y, W)$ there
exists a linear transformation $\theta \in \mathrm{Hom}(Y, V)$ such that $\beta = \alpha\theta$.


Exercise 294

Let V, W, and Y be vector spaces over a field F and let $\alpha \in \mathrm{Hom}(V, W)$ be a
monomorphism. Show that for each linear transformation $\beta \in \mathrm{Hom}(V, Y)$ there
exists a linear transformation $\theta \in \mathrm{Hom}(W, Y)$ such that $\beta = \theta\alpha$.


Exercise 295

Let V be a vector space finitely generated over a field F, the dimension of which
is even. Show that there exists an isomorphism $\alpha : V \to V$ satisfying the condition
that $\alpha^2(v) = -v$ for all $v \in V$.


Exercise 296 

Let a: V —> W be a linear transformation between vector spaces over a field F 
and let D be a nonempty linearly-independent subset of im(@). Show that there 
exists a basis B of V satisfying the condition that {æ (v) | v € B} = D. 


Exercise 297 

Let V and W be vector spaces over a field F and let $\alpha \in \mathrm{Hom}(V, W)$ satisfy the
condition that $\alpha\beta\alpha$ is not the 0-function for any linear transformation $\beta : W \to V$
which is not the 0-function. Show that $\alpha$ is an isomorphism.


Exercise 298 

Let F be a field and let a : F? —> F[X] be the linear transformation defined by 
a 
b | > (a+b)X + (a+c)X°. Find the nullity and rank of æ. 
c 


Exercise 299 
Let F be a field and let p(X) = X? +bX +c € F[X] be a polynomial having dis- 
tinct nonzero roots dı and d in F. Let œ : F? — F be the linear transformation 
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a 
defined by a: | a2 | +> a3+baz+ca, and let B: F —> F be the linear trans- 
a3 
a1 a2 
formation defined by £ : [a1, a2, a3, ...] œ> | œ a2 5a a3 Jats 
a3 a4 


Show that the nullity of £ is at least 2. 


Exercise 300 
Let $\Omega$ be a nonempty set and let V be the collection of all subsets of $\Omega$, considered
as a vector space over $GF(2)$. Show that this vector space is isomorphic
to $GF(2)^{\Omega}$.


Exercise 301

Let F be a field and let V be the subspace of $F^{\infty}$ consisting of all sequences
$[a_1, a_2, a_3, \ldots]$ in which $a_i = 0$ for all even i. Let W be the subspace of $F^{\infty}$
consisting of all sequences $[a_1, a_2, a_3, \ldots]$ in which $a_i = 0$ for all odd i. Show
that $V \cong F^{\infty} \cong W$.


Exercise 302

Let V be a vector space over a field F having subspaces W and W'. Let
$Y = \left\{ \begin{bmatrix} w \\ w' \end{bmatrix} \,\middle|\, w \in W \text{ and } w' \in W' \right\}$, which is a subspace of $V^2$. Let $\alpha : Y \to V$
be the linear transformation defined by $\alpha : \begin{bmatrix} w \\ w' \end{bmatrix} \mapsto w + w'$. Find the kernel of $\alpha$,
and show that it is isomorphic to $W \cap W'$.


Exercise 303

Let V be a vector space over a field F. Let W be a subspace of V and let W' be
a complement of W in V. Let $\alpha : W \to W'$ be a linear transformation. Show that
W is isomorphic to the subspace $Y = \{w + \alpha(w) \mid w \in W\}$ of V.


Exercise 304 
Show that there is no vector space over any field F having precisely 15 elements. 


Exercise 305 
Let F be a field and let $V = F[X]$. Show that $V \cong V^2$.


Exercise 306 

Let V, W, and Y be vector spaces over a field F. Let $\{\alpha_1, \ldots, \alpha_n\}$ be a finite subset
of $\mathrm{Hom}(V, W)$ and let $\beta \in \mathrm{Hom}(V, Y)$ be a linear transformation satisfying
$\bigcap_{i=1}^{n} \ker(\alpha_i) \subseteq \ker(\beta)$. Show that there exist linear transformations $\gamma_1, \ldots, \gamma_n$
in $\mathrm{Hom}(W, Y)$ satisfying $\beta = \sum_{i=1}^{n} \gamma_i \alpha_i$.


Exercise 307

Let V be a vector space over a field F and let W be a subspace of V. For each
$v \in V$, let $v + W = \{v + w \mid w \in W\}$. Let V/W be the collection of all the
sets of the form $v + W$ for $v \in V$ and define operations of addition and scalar
multiplication on V/W by setting $(v + W) + (v' + W) = (v + v') + W$ and
$c(v + W) = (cv) + W$ for all $v, v' \in V$ and $c \in F$. Show that:
(1) $v + W = v' + W$ if and only if $v - v' \in W$;
(2) V/W, with the given operations, is a vector space over F;
(3) The function $v \mapsto v + W$ is an epimorphism from V to V/W, the kernel of
which equals W;
(4) Every complement of W in V is isomorphic to V/W;
(5) If $[v + W] \cap [v' + W] \neq \emptyset$ then $v + W = v' + W$.

The space V/W is called the factor space of V by W.


Exercise 308 

Let F be a field and let m > n be positive integers. Let A and B be fixed matrices
in $M_{n \times m}(F)$ and let $\theta : M_{m \times n}(F) \to M_{n \times m}(F)$ be the linear transformation
defined by $\theta : C \mapsto ACB$. Show that $\theta$ is not an isomorphism.


Exercise 309

Let F be a field and let K and L be fields containing F as a subfield. Show
that the set of homomorphisms of unital F-algebras from L to K is a linearly-independent
subset of the vector space $K^L$ over K.


Exercise 310

Let V and W be vector spaces over a field F, with V finitely generated, and let
Y be a proper subspace of V. Let $\alpha \in \mathrm{Hom}(V, W)$ and let $\beta$ be the restriction of
$\alpha$ to Y. Show that either $\ker(\beta) \subsetneq \ker(\alpha)$ or $\mathrm{im}(\beta) \subsetneq \mathrm{im}(\alpha)$.


Exercise 311

Let V be a vector space over a field F, and let $U \subseteq W$ be subspaces of V. Assume
that there exist $x, y \in V$ satisfying the condition that the affine sets
$x + U = \{x + u \mid u \in U\}$ and $y + W = \{y + w \mid w \in W\}$ have a vector in common. Show
that $x + U \subseteq y + W$.


Exercise 312 

Let V and W be vector spaces over a field F. A function f : V —> W is linearly 

independent if and only if gr( f) is a linearly-independent subset of V x W. 

(1) Show that if f : V —> W is linearly independent and if œ € Hom(V, W) then 
f +a is linearly independent. 

(2) Show that no linear transformation is linearly independent. 


Exercise 313 

Let V and W be vector spaces over a field F. A linear transformation $\alpha : V \to W$
is said to have algebraic degree n if and only if the set $\{v, \alpha(v), \ldots, \alpha^n(v)\}$
is linearly dependent for any $v \in V$, but there exists an element $v_0$ of V such
that the set $\{v_0, \alpha(v_0), \ldots, \alpha^{n-1}(v_0)\}$ is linearly independent. Find the algebraic
degree of $\alpha \in \mathrm{Hom}(\mathbb{R}^5, \mathbb{R}^5)$ defined by

$$\alpha : \begin{bmatrix} a \\ b \\ c \\ d \\ e \end{bmatrix} \mapsto \begin{bmatrix} a + 2b \\ b - c \\ a \\ c - a \\ c \end{bmatrix}.$$


Exercise 314

Let n be a positive integer and let F be a field the characteristic of which does not
divide n. Let W be the subspace of $M_{n \times n}(F)$ generated by
$\{AB - BA \mid A, B \in M_{n \times n}(F)\}$. Show that $\dim(W) = n^2 - 1$.


Exercise 315

Let V be a vector space finitely generated over a field F and let $\alpha \in \mathrm{Hom}(V, V)$.
Show that there exists a positive integer t satisfying $V = \mathrm{im}(\alpha^t) \oplus \ker(\alpha^t)$.


Let V be a vector space over a field F. A linear transformation $\alpha$ from V to itself
is called an endomorphism of V. We will denote the set of all endomorphisms of
V by End(V). This set is nonempty, since it includes the functions of the form
$\sigma_c : v \mapsto cv$ for $c \in F$. In particular, it includes the 0-endomorphism $\sigma_0 : v \mapsto 0_V$
and the identity endomorphism $\sigma_1 : v \mapsto v$. If V is nontrivial, these functions are
not the same. We see that we have two operations defined on End(V): addition
and multiplication (given by composition). Indeed, as a direct consequence of the
definitions we conclude the following:


Proposition 7.1 If V is a nontrivial vector space over a field F, then End(V)
is an associative unital F-algebra with $\sigma_0$ being the identity element for addition
and $\sigma_1$ being the identity element for multiplication.


If V is a nontrivial vector space over a field F then there exists a function
$\sigma : F \to \mathrm{End}(V)$ defined by $\sigma : c \mapsto \sigma_c$ for all $c \in F$. This function is monic, for
if $\sigma_c = \sigma_d$ then for any $0_V \neq v \in V$ we have $cv = \sigma_c(v) = \sigma_d(v) = dv$ and hence
$(c - d)v = 0_V$. Since $0_V \neq v$, this implies that $c - d = 0$ and so $c = d$. Moreover,
if $c, d \in F$ then $\sigma_c + \sigma_d = \sigma_{c+d}$ and $\sigma_c \sigma_d = \sigma_{cd}$, so $\sigma$ is a monic homomorphism
of unital F-algebras. We can use this function to identify F with its image under $\sigma$
and consider it a subalgebra of the F-algebra End(V).

If $\alpha, \beta \in \mathrm{End}(V)$ and if $c \in F$, then we have already seen that the functions $\alpha + \beta$,
$\alpha\beta$, and $c\alpha$ all belong to End(V). Therefore, we see that if $p(X) = \sum_{i=0}^{n} a_i X^i \in F[X]$
then $p(\alpha) = \sum_{i=0}^{n} a_i \alpha^i$ is an endomorphism of V, and, indeed, the set $F[\alpha]$
of all endomorphisms of V of this form is an F-subalgebra of End(V). The function
from $F[X]$ to $F[\alpha]$ given by $p(X) \mapsto p(\alpha)$ is immediately seen to be an epic
homomorphism of unital F-algebras for any $\alpha \in \mathrm{End}(V)$.
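
The substitution map $p(X) \mapsto p(\alpha)$ can be illustrated computationally. The following is a minimal sketch, assuming that an endomorphism of $F^n$ is represented by an n-by-n matrix stored as a list of rows; the helper names (`mat_mul`, `poly_at`, and so on) are illustrative and do not come from the text. It evaluates $p(\alpha)$ by Horner's rule and checks, for one choice of matrix and polynomials, that the product of two polynomials is sent to the composite of their images.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scalar(c, n):
    """Matrix of the endomorphism sigma_c on F^n (c times the identity)."""
    return [[c if i == j else 0 for j in range(n)] for i in range(n)]

def poly_at(coeffs, A):
    """Evaluate p(A) = sum a_i A^i by Horner's rule; coeffs = [a_0, ..., a_n]."""
    n = len(A)
    result = scalar(0, n)
    for a in reversed(coeffs):
        result = mat_add(mat_mul(result, A), scalar(a, n))
    return result

def poly_mul(p, q):
    """Coefficient list of the product of two polynomials."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

A = [[1, 2], [0, 3]]   # hypothetical endomorphism of Q^2, chosen for illustration
p = [1, 0, 2]          # p(X) = 1 + 2X^2
q = [-3, 1]            # q(X) = -3 + X

lhs = poly_at(poly_mul(p, q), A)              # (pq)(A)
rhs = mat_mul(poly_at(p, A), poly_at(q, A))   # p(A) q(A)
print(lhs == rhs)
```

Integer entries are used only for convenience; the same code works with `fractions.Fraction` entries when exact rational arithmetic is wanted.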


Example Let F = GF(2) and let p(X) = X? + X € F[X]. Then p(a) = 0 for every 
a € F. However, p(a) ~ oo, where a € End(F7) is defined by a : H > Hl 


Example Structures of the form $F[\alpha]$ are important in many areas of mathematics.
For example, let V be the collection of all infinitely-differentiable functions from $\mathbb{R}$
to $\mathbb{R}$ and let $\delta$ be the differentiation endomorphism on V. If $p(X) = \sum_{i=0}^{n} a_i X^i \in \mathbb{R}[X]$,
then we have $p(\delta) : f \mapsto a_0 f + \sum_{i=1}^{n} a_i f^{(i)}$, where $f^{(i)}$ denotes the ith
derivative of f. Such an endomorphism is called a differential operator with constant
coefficients on V. If $c \in \mathbb{R}$ and if $f_c \in V$ is the function given by $f_c : x \mapsto e^{cx}$
then $\delta(f_c) = c f_c$ and so $p(\delta) : f_c \mapsto \sum_{i=0}^{n} a_i c^i f_c = \left(\sum_{i=0}^{n} a_i c^i\right) f_c = p(c) f_c$.
Thus, $p(\delta)(f_c)$ is the 0-function whenever c is a root of $p(X)$. Hence $f_c \in \ker(p(\delta))$
for each root c of $p(X)$.


Example Let V be the convolution algebra on $\mathbb{R}$ and let $h \in V$ be the constant
function $t \mapsto 1$. Then h defines an endomorphism of V given by $f \mapsto h * f$, called
the integration endomorphism since $h * f : t \mapsto \int_0^t f(u)\,du$.


Example Let F be a field and let $(K, \bullet)$ be a nonassociative F-algebra. An endomorphism
$\delta \in \mathrm{End}(K)$ is a derivation if and only if $\delta(v \bullet w) = [\delta(v)] \bullet w + v \bullet [\delta(w)]$.
Thus, for example, if K is a Lie algebra then, as a consequence of the Jacobi
identity, we see that every $y \in K$ defines a derivation $\delta_y$ of K given by $\delta_y : v \mapsto y \bullet v$.
Also, if K is the $\mathbb{R}$-algebra consisting of all infinitely-differentiable functions in $\mathbb{R}^{\mathbb{R}}$,
then the endomorphism of K which assigns to each function in K its derivative is
a derivation. The set of all derivations defined on K is a subspace of End(K). If $\delta$
and $\delta'$ are derivations on K, then $\delta\delta'$ is not, in general, a derivation on K, but the
Lie product $\delta\delta' - \delta'\delta$ is always a derivation on K, and so the set of all derivations
on K is a Lie algebra over F.
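
The last two claims can be checked on a small case. The sketch below is an illustration under stated assumptions, not part of the text: it takes K to be the polynomial algebra $\mathbb{Q}[X]$ (restricted here to integer coefficients), with polynomials stored as coefficient lists, and uses the two derivations $p \mapsto p'$ and $p \mapsto X p'$. All function names are hypothetical. It tests the Leibniz rule on sample polynomials: the composite of the two derivations fails it, while their Lie product satisfies it.

```python
def trim(p):
    """Drop trailing zero coefficients so each polynomial has one representation."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    p, q = p + [0] * (n - len(p)), q + [0] * (n - len(q))
    return trim(a + b for a, b in zip(p, q))

def sub(p, q):
    return add(p, [-b for b in q])

def mul(p, q):
    out = [0] * max(len(p) + len(q) - 1, 0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def d(p):
    """Formal derivative p -> p'."""
    return trim(i * a for i, a in enumerate(p))[1:]

def xd(p):
    """The derivation p -> X * p'."""
    return trim(i * a for i, a in enumerate(p))

def composite(p):
    """The composite d(xd(p)); not a derivation in general."""
    return d(xd(p))

def lie(p):
    """The Lie product (d xd - xd d) applied to p."""
    return sub(d(xd(p)), xd(d(p)))

def leibniz_holds(delta, v, w):
    """Check delta(vw) == delta(v) w + v delta(w) for one pair (v, w)."""
    return delta(mul(v, w)) == add(mul(delta(v), w), mul(v, delta(w)))

v, w = [1, 2, 3], [0, -1, 0, 4]          # 1 + 2X + 3X^2 and -X + 4X^3
print(leibniz_holds(lie, v, w))           # Lie product obeys the Leibniz rule here
print(leibniz_holds(composite, [0, 1], [0, 1]))  # composite fails it on X * X
```

A check on finitely many pairs is of course not a proof; it only illustrates the statement.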


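Given a nontrivial vector space V over a field F, we note that the F-algebra
End(V) is neither necessarily commutative nor necessarily entire, as the following
examples show:


Example Let F be a field and let $V = F^3$. Let $\alpha, \beta \in \mathrm{End}(V)$ be the endomorphisms
defined by

$$\alpha : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} b \\ a \\ c \end{bmatrix} \quad \text{and} \quad \beta : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} a \\ 0 \\ 0 \end{bmatrix}.$$

Then

$$\beta\alpha : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} b \\ 0 \\ 0 \end{bmatrix} \quad \text{and} \quad \alpha\beta : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} 0 \\ a \\ 0 \end{bmatrix},$$

so $\beta\alpha \neq \alpha\beta$.


Example Let F be a field and let $V = F^3$. Let $\alpha, \beta \in \mathrm{End}(V)$ be the endomorphisms
defined by

$$\alpha : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} 0 \\ 0 \\ c \end{bmatrix} \quad \text{and} \quad \beta : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} a \\ 0 \\ 0 \end{bmatrix}.$$

Then $\beta\alpha = \sigma_0 = \alpha\beta$, although neither $\alpha$ nor $\beta$ equals $\sigma_0$.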
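
Both examples are easy to confirm mechanically. The sketch below is only an illustration, with vectors of $F^3$ modeled as Python tuples of integers and with hypothetical function names; since a linear map is determined by its values on a basis, the second example is checked on the standard basis.

```python
def compose(f, g):
    """Return the composite f g, that is, v -> f(g(v))."""
    return lambda v: f(g(v))

# First example: alpha swaps the first two entries, beta keeps only the first.
def alpha(v):
    a, b, c = v
    return (b, a, c)

def beta(v):
    a, b, c = v
    return (a, 0, 0)

# Second example: gamma keeps only the third entry.
def gamma(v):
    a, b, c = v
    return (0, 0, c)

v = (1, 2, 3)
print(compose(beta, alpha)(v), compose(alpha, beta)(v))   # two different vectors

basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
zero = (0, 0, 0)
# beta gamma and gamma beta both vanish on a basis, hence equal sigma_0.
print(all(compose(beta, gamma)(e) == zero and compose(gamma, beta)(e) == zero
          for e in basis))
```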


We do, however, have the following: 


Proposition 7.2 Let V be a vector space over a field F. Then for all
$\alpha \in \mathrm{End}(V)$ and all $c \in F$ we have $\alpha\sigma_c = \sigma_c\alpha$.


Proof If $v \in V$ then $\alpha\sigma_c(v) = \alpha(cv) = c\alpha(v) = \sigma_c\alpha(v)$.


An endomorphism of a vector space V over a field F which is also an isomorphism
(i.e., which is both monic and epic) is called an automorphism of V. Since
$\alpha(0_V) = 0_V$ for any endomorphism $\alpha$ of V, we see that any automorphism of V induces
a permutation of $V \setminus \{0_V\}$. Similarly, a homomorphism of F-algebras which
is also an isomorphism is an automorphism of F-algebras.

By what we have already seen, we know that $\alpha \in \mathrm{End}(V)$ is an automorphism
if and only if there exists an endomorphism $\alpha^{-1} \in \mathrm{End}(V)$ satisfying
$\alpha\alpha^{-1} = \sigma_1 = \alpha^{-1}\alpha$. We will denote the set of all automorphisms of V by Aut(V). This set
is nonempty, since $\sigma_1 \in \mathrm{Aut}(V)$, where $\sigma_1^{-1} = \sigma_1$. Moreover, if $\alpha, \beta \in \mathrm{Aut}(V)$ then
$(\alpha\beta)(\beta^{-1}\alpha^{-1}) = \alpha(\beta\beta^{-1})\alpha^{-1} = \alpha\alpha^{-1} = \sigma_1$ and similarly $(\beta^{-1}\alpha^{-1})(\alpha\beta) = \sigma_1$.
Thus $\alpha\beta \in \mathrm{Aut}(V)$, with $(\alpha\beta)^{-1} = \beta^{-1}\alpha^{-1}$. It is also clear that if $\alpha \in \mathrm{Aut}(V)$
then $\alpha^{-1} \in \mathrm{Aut}(V)$. If $\alpha \in \mathrm{Aut}(V)$ and $0 \neq c \in F$, then $c\alpha \in \mathrm{Aut}(V)$ and
$(c\alpha)^{-1} = c^{-1}\alpha^{-1}$.


Example Let V be a vector space over a field F and let n > 1 be an integer. Any
permutation $\pi$ of the set $\{1, \ldots, n\}$ defines an automorphism $\alpha_\pi$ of $V^n$ given by

$$\alpha_\pi : \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} \mapsto \begin{bmatrix} v_{\pi(1)} \\ v_{\pi(2)} \\ \vdots \\ v_{\pi(n)} \end{bmatrix},$$

which rearranges the entries of each vector according to
the permutation $\pi$. More generally, if V is a vector space over a field F having a
basis B = {v; | i € 2} and if z is a permutation of 2, then there is an automorphism 
of V defined by J` 5-4 divi > Doe, Gn(iy Umi) for each finite subset A of 2. 


Example Let F be a field and let n be a positive integer. We have already seen
that the function $A \mapsto A^T$ is an automorphism of $M_{n \times n}(F)$, considered as a vector
space over F.


Example Let V be a vector space having finite dimension n over a field F and
let v and y be nonzero elements of V. Then there exist bases $\{v_1, \ldots, v_n\}$ and
$\{y_1, \ldots, y_n\}$ of V satisfying $v_1 = v$ and $y_1 = y$. The function $\alpha : V \to V$ defined
by $\alpha : \sum_{i=1}^{n} a_i v_i \mapsto \sum_{i=1}^{n} a_i y_i$ is thus an automorphism of V satisfying $\alpha(v) = y$.


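Let V be a vector space over a field F and let n be a positive integer. We will list
several types of automorphisms, called elementary automorphisms, of a vector space
of the form $V^n$. These automorphisms will play an important part in our ensuing
discussion.

(1) If $1 \le h \neq k \le n$, we define $\varepsilon_{hk} \in \mathrm{Aut}(V^n)$ by

$$\varepsilon_{hk} : \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} \mapsto \begin{bmatrix} w_1 \\ \vdots \\ w_n \end{bmatrix}, \quad \text{where } w_i = \begin{cases} v_k & \text{if } i = h, \\ v_h & \text{if } i = k, \\ v_i & \text{otherwise.} \end{cases}$$

This automorphism satisfies $\varepsilon_{hk}^{-1} = \varepsilon_{hk}$.

(2) If $1 \le h \le n$, and if $0 \neq c \in F$, we define $\varepsilon_{h;c} \in \mathrm{Aut}(V^n)$ by

$$\varepsilon_{h;c} : \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} \mapsto \begin{bmatrix} w_1 \\ \vdots \\ w_n \end{bmatrix}, \quad \text{where } w_i = \begin{cases} c v_i & \text{if } i = h, \\ v_i & \text{otherwise.} \end{cases}$$

This automorphism satisfies $\varepsilon_{h;c}^{-1} = \varepsilon_{h;c^{-1}}$.

(3) If $1 \le h \neq k \le n$ and if $c \in F$, we define $\varepsilon_{hk;c} \in \mathrm{Aut}(V^n)$ by

$$\varepsilon_{hk;c} : \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} \mapsto \begin{bmatrix} w_1 \\ \vdots \\ w_n \end{bmatrix}, \quad \text{where } w_i = \begin{cases} v_i + c v_k & \text{if } i = h, \\ v_i & \text{otherwise.} \end{cases}$$

This automorphism satisfies $\varepsilon_{hk;c}^{-1} = \varepsilon_{hk;-c}$.

Identifying the automorphisms of a finite-dimensional vector space V over a field
F is a problem which will be of major importance to us later, and so it is important
to characterize these functions.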
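
The three elementary automorphisms and their stated inverses can be sketched in code. The example below is an illustration only: it assumes $V = \mathbb{Q}$, so that $V^n = \mathbb{Q}^n$, stores vectors as lists of `Fraction` values, and uses the hypothetical names `swap`, `scale`, and `shear` for $\varepsilon_{hk}$, $\varepsilon_{h;c}$, and $\varepsilon_{hk;c}$. Indices are 1-based to match the text.

```python
from fractions import Fraction

def swap(h, k):
    """epsilon_{hk}: interchange entries h and k (1-based, h != k)."""
    if h == k:
        raise ValueError("h and k must differ")
    def eps(v):
        w = list(v)
        w[h - 1], w[k - 1] = w[k - 1], w[h - 1]
        return w
    return eps

def scale(h, c):
    """epsilon_{h;c}: multiply entry h by the nonzero scalar c."""
    if c == 0:
        raise ValueError("c must be nonzero")
    def eps(v):
        w = list(v)
        w[h - 1] = c * w[h - 1]
        return w
    return eps

def shear(h, k, c):
    """epsilon_{hk;c}: add c times entry k to entry h (h != k)."""
    if h == k:
        raise ValueError("h and k must differ")
    def eps(v):
        w = list(v)
        w[h - 1] = w[h - 1] + c * w[k - 1]
        return w
    return eps

v = [Fraction(1), Fraction(2), Fraction(3)]
c = Fraction(5)

# Each map followed by its claimed inverse returns the original vector.
print(swap(1, 3)(swap(1, 3)(v)) == v)
print(scale(2, 1 / c)(scale(2, c)(v)) == v)
print(shear(1, 2, -c)(shear(1, 2, c)(v)) == v)
```

The restriction $c \neq 0$ in (2) is what makes $c^{-1}$ available; no such restriction is needed in (3), since $-c$ always exists.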


Proposition 7.3 Let V be a vector space of finite dimension n over a field F.
Then the following conditions on an endomorphism $\alpha$ of V are equivalent:
(1) $\alpha$ is an automorphism of V;
(2) $\alpha$ is monic;
(3) $\alpha$ is epic.


Proof By definition, (1) implies (2). Now assume (2). By Proposition 6.10, we see
that the rank of $\alpha$ equals n and so $\mathrm{im}(\alpha) = V$ by Proposition 5.11, proving (3).
Now assume (3). By Proposition 6.10, we see that the nullity of $\alpha$ equals $n - n = 0$
and so $\ker(\alpha) = \{0_V\}$, proving that $\alpha$ is monic as well, and so is bijective. This
proves (1).


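Proposition 7.4 Let V be a finite-dimensional vector space over a field F and
let $\alpha \in \mathrm{End}(V)$. If there exists a $\beta \in \mathrm{End}(V)$ satisfying $\alpha\beta = \sigma_1$ or $\beta\alpha = \sigma_1$,
then $\alpha \in \mathrm{Aut}(V)$ and $\beta = \alpha^{-1}$.


Proof If $\beta\alpha = \sigma_1$ then $\ker(\alpha) \subseteq \ker(\sigma_1) = \{0_V\}$ and so, by Proposition 7.3,
$\alpha \in \mathrm{Aut}(V)$. Similarly, if $\alpha\beta = \sigma_1$ then $\mathrm{im}(\alpha) \supseteq \mathrm{im}(\sigma_1) = V$ and so, by Proposition 7.3,
$\alpha \in \mathrm{Aut}(V)$. Moreover, if $\alpha\beta = \sigma_1$ we see that $\alpha^{-1} = \alpha^{-1}\sigma_1 = \alpha^{-1}(\alpha\beta) = \beta$
and similarly $\alpha^{-1} = \beta$ when $\beta\alpha = \sigma_1$.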


Example Proposition 7.3 and Proposition 7.4 are no longer true if we remove the
condition of finite dimensionality. For example, let F be a field and let $V = F[X]$.
Define the endomorphisms $\alpha$ and $\beta$ of V by setting $\alpha : \sum_{i=0}^{n} a_i X^i \mapsto \sum_{i=0}^{n} a_i X^{i+1}$
and $\beta : \sum_{i=0}^{n} a_i X^i \mapsto \sum_{i=1}^{n} a_i X^{i-1}$. Then $\alpha, \beta \notin \mathrm{Aut}(V)$, despite the fact that $\alpha$ is
monic and $\beta$ is epic. Moreover, $\beta\alpha = \sigma_1$ but $\alpha\beta \neq \sigma_1$.
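
A short sketch makes the behaviour of these two shift maps concrete. It assumes polynomials are stored as coefficient lists `[a_0, a_1, ...]` with the zero polynomial as the empty list; the names `alpha` and `beta` mirror the text, but the representation is an assumption of this illustration.

```python
def alpha(p):
    """Multiply by X: shift every coefficient up by one degree."""
    return [0] + p if p else []

def beta(p):
    """Drop the constant term and shift every other coefficient down by one degree."""
    return p[1:]

p = [4, 0, 7]                 # 4 + 7X^2

print(beta(alpha(p)) == p)    # beta alpha acts as the identity on p
print(alpha(beta(p)))         # alpha beta loses the constant term of p

# beta is not monic: the nonzero constant polynomial 4 is sent to zero.
print(beta([4]) == [])
```

Every value of `alpha` has constant term 0, so the constant polynomial 1 is not in its image, which is why $\alpha$ fails to be epic.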


Let V be a vector space over a field F and let $\alpha \in \mathrm{End}(V)$. A subspace W of V
is invariant under $\alpha$ if and only if $\alpha(w) \in W$ for all $w \in W$ or, in other words, if and
only if $\alpha(W) \subseteq W$. Thus, W is invariant under $\alpha$ if and only if the restriction of $\alpha$
to W is an endomorphism of W. It is clear that V and $\{0_V\}$ are both invariant under
every endomorphism of V. If $\alpha \in \mathrm{End}(V)$ then $\mathrm{im}(\alpha)$ and $\ker(\alpha)$ are both invariant
under $\alpha$.


Example Let F be a field and, for each positive integer k, let $W_k$ be the subspace
of $F[X]$ composed of all polynomials of degree at most k. Let $\delta$ be the formal
differentiation endomorphism of $F[X]$, namely the endomorphism defined by
$\delta : \sum_{i=0}^{n} a_i X^i \mapsto \sum_{i=1}^{n} i a_i X^{i-1}$. Then each of the subspaces $W_k$ is invariant under
$\delta$. Now assume that F is of characteristic 0. If $p(X) = \sum_{i=0}^{k} a_i X^i \in W_k$ and if
$a \in F$ then it is easy to check that $p(X) = p(a) + \sum_{i=1}^{k} \frac{1}{i!} \left[\delta^i(p)(a)\right] (X - a)^i$. The
coefficients $\frac{1}{i!} \left[\delta^i(p)(a)\right]$ are known as the Taylor coefficients of $p(X)$ around a.
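
The Taylor expansion can be verified on a concrete polynomial. The following is a sketch under the assumption that $F = \mathbb{Q}$, with polynomials stored as coefficient lists and exact arithmetic provided by `fractions.Fraction`; the function names are illustrative. It computes the Taylor coefficients by repeated formal differentiation and then re-expands $\sum_i c_i (X - a)^i$ to recover the original coefficients.

```python
from fractions import Fraction
from math import factorial

def deriv(p):
    """Formal derivative of a coefficient list [a_0, a_1, ...]."""
    return [i * c for i, c in enumerate(p)][1:]

def evaluate(p, a):
    """Value p(a), computed by Horner's rule."""
    result = Fraction(0)
    for c in reversed(p):
        result = result * a + c
    return result

def taylor_coeffs(p, a):
    """List of (1/i!) * (i-th derivative of p)(a) for i = 0, 1, ..., deg p."""
    coeffs, q, i = [], list(p), 0
    while q:
        coeffs.append(evaluate(q, a) / factorial(i))
        q = deriv(q)
        i += 1
    return coeffs

def expand(coeffs, a):
    """Standard coefficients of sum_i coeffs[i] * (X - a)^i."""
    result = [Fraction(0)] * len(coeffs)
    power = [Fraction(1)]                      # coefficients of (X - a)^0
    for c in coeffs:
        for j, x in enumerate(power):
            result[j] += c * x
        # multiply the current power by (X - a)
        power = [(power[j - 1] if j > 0 else 0) - a * (power[j] if j < len(power) else 0)
                 for j in range(len(power) + 1)]
    return result

p = [1, 2, 3]                 # p(X) = 1 + 2X + 3X^2
a = 2
t = taylor_coeffs(p, a)
print(t)
print(expand(t, a) == p)
```

Division by $i!$ is the step that needs characteristic 0: in characteristic q the integer $q!$ is zero in F and has no inverse.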


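Example Let $V = \mathbb{R}^2$ and let $\alpha$ be the automorphism of V defined by
$\alpha : \begin{bmatrix} a \\ b \end{bmatrix} \mapsto \begin{bmatrix} b \\ -a \end{bmatrix}$. Let W be a proper subspace of V which is invariant under $\alpha$.
Then $\dim(W) \le 1$ and so there exists a vector $w = \begin{bmatrix} c \\ d \end{bmatrix}$ satisfying $W = \mathbb{R}w$. Since
$\alpha(w) = \begin{bmatrix} d \\ -c \end{bmatrix} \in W$, it follows that there exists a real number e such that $\alpha(w) = ew$.
That is to say, $ec = d$ and $ed = -c$. From this we learn that $ce^2 = -c$, and so
$c = d = 0$. This proves that $W = \left\{ \begin{bmatrix} 0 \\ 0 \end{bmatrix} \right\}$ and so we see that V has no proper
nontrivial subspaces invariant under $\alpha$.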

Example Let F be a field and let n be a positive integer. Let $\alpha$ be the automorphism
of $F^n$ defined by

$$\alpha : \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix} \mapsto \begin{bmatrix} a_n \\ a_1 \\ \vdots \\ a_{n-1} \end{bmatrix}.$$

A subspace W invariant under $\alpha$ is cyclic. Cyclic subspaces of $F^n$, where F is a finite field,
are important in defining certain families of error-correcting codes.


Let F be a field. An element a of an F-algebra $(K, \bullet)$ is idempotent if and only
if $a^2 = a$. If V is a vector space over F, then an idempotent element of End(V) is
called a projection. Note that if $\alpha \in \mathrm{End}(V)$ is a projection and if $w = \alpha(v) \in \mathrm{im}(\alpha)$
then $\alpha(w) = \alpha^2(v) = \alpha(v) = w$, so that the restriction of $\alpha$ to its image is just $\sigma_1$.
The converse is also true. If $\alpha \in \mathrm{End}(V)$ satisfies the condition that the restriction of
$\alpha$ to its image is just $\sigma_1$, then for each $v \in V$ we have $\alpha^2(v) = \alpha(\alpha(v)) = \sigma_1(\alpha(v)) = \alpha(v)$
and so $\alpha$ is a projection.


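Example If F is a field then the endomorphism of $F^3$ defined by

$$\begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} 3a - 2c \\ -a + b + c \\ 3a - 2c \end{bmatrix}$$

is a projection.


Example The sum of two projections need not be a projection. For example, if
$V = \mathbb{R}^3$ then the endomorphisms $\alpha$ and $\beta$ of V defined by

$$\alpha : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} a \\ b \\ 0 \end{bmatrix} \quad \text{and} \quad \beta : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} 0 \\ b \\ c \end{bmatrix}$$

are projections, but $\alpha + \beta$ is not a projection.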
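
Both examples reduce to checking whether a matrix equals its own square. The sketch below is an illustration with hypothetical names: each endomorphism is written as its matrix with respect to the standard basis, using integer entries. Because the identity $P^2 = P$ holds over the integers, it continues to hold after the entries are mapped into any field F.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_projection(A):
    """True when A is idempotent, i.e. A*A == A."""
    return mat_mul(A, A) == A

# Matrix of (a, b, c) -> (3a - 2c, -a + b + c, 3a - 2c).
P = [[3, 0, -2],
     [-1, 1, 1],
     [3, 0, -2]]

# Matrices of alpha: (a, b, c) -> (a, b, 0) and beta: (a, b, c) -> (0, b, c).
A = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
B = [[0, 0, 0], [0, 1, 0], [0, 0, 1]]
S = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]   # alpha + beta

print(is_projection(P), is_projection(A), is_projection(B), is_projection(S))
```

The sum fails because $\alpha + \beta$ doubles the middle coordinate, and doubling twice is not the same as doubling once over $\mathbb{R}$.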


Example If W is a subspace of a vector space V over a field F having a complement
Y in V, we know that every element $v \in V$ can be written in a unique way in the form
$w + y$, where $w \in W$ and $y \in Y$. The endomorphism of V defined by $v \mapsto w$ is a
projection the image of which is W. Statisticians often consider data in $V = \mathbb{R}^n$ and
use a projection in End(V) to project it onto a subspace W of V that best preserves
the variance in the data. This standard method in data analysis is called principal
component analysis, and there exist several efficient algorithms for performing it.


In fact, all projections of a vector space are of the form in the previous example,
as the following proposition shows.


Proposition 7.5 Let V be a vector space over a field F and let $\alpha \in \mathrm{End}(V)$
be a projection. Then $V = \mathrm{im}(\alpha) \oplus \ker(\alpha)$.


Proof If $v \in \mathrm{im}(\alpha) \cap \ker(\alpha)$ then there exists an element y of V satisfying $v = \alpha(y)$
and so $v = \alpha(y) = \alpha^2(y) = \alpha(v) = 0_V$. Thus $\mathrm{im}(\alpha)$ and $\ker(\alpha)$ are disjoint. If v is an arbitrary
vector in V then $v = [v - \alpha(v)] + \alpha(v) \in \ker(\alpha) + \mathrm{im}(\alpha)$, since
$\alpha(v - \alpha(v)) = \alpha(v) - \alpha^2(v) = 0_V$. Therefore, $V = \mathrm{im}(\alpha) \oplus \ker(\alpha)$.
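
The decomposition used in the proof can be carried out explicitly. As a hedged illustration, the sketch below takes the projection $(a, b, c) \mapsto (3a - 2c, -a + b + c, 3a - 2c)$ on $\mathbb{Q}^3$ with integer test vectors, and splits a vector v into the part $v - \alpha(v)$ lying in the kernel and the part $\alpha(v)$ lying in the image. The function names are hypothetical.

```python
def alpha(v):
    """The projection (a, b, c) -> (3a - 2c, -a + b + c, 3a - 2c)."""
    a, b, c = v
    return (3 * a - 2 * c, -a + b + c, 3 * a - 2 * c)

def decompose(v):
    """Return (k, w) with v = k + w, k in ker(alpha) and w in im(alpha)."""
    w = alpha(v)
    k = tuple(x - y for x, y in zip(v, w))
    return k, w

v = (1, 2, 3)
k, w = decompose(v)
print(k, w)
print(alpha(k) == (0, 0, 0))   # k lies in the kernel
print(alpha(w) == w)           # alpha restricts to the identity on its image
```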


Proposition 7.6 Let V be a vector space over a field F and let $\alpha \in \mathrm{End}(V)$.
A subspace W of V is invariant under $\alpha$ if and only if $\beta\alpha\beta = \alpha\beta$ for each
projection $\beta$ of V the image of which is W.


Proof Assume that W is invariant under $\alpha$ and let $\beta$ be a projection of V the image
of which is W. By Proposition 7.5, we have $V = W \oplus \ker(\beta)$. If $v \in V$, we
can therefore write $v = w + y$, where $w \in W$ and $y \in \ker(\beta)$. Hence
$\alpha\beta(v) = \alpha\beta(w) + \alpha\beta(y) = \alpha(w) + 0_V = \alpha(w) = \beta\alpha(w) = \beta\alpha\beta(v)$, showing that $\beta\alpha\beta = \alpha\beta$.
Conversely, if $\beta\alpha\beta = \alpha\beta$ for each projection $\beta$ of V the image of which is W
then, for each such $\beta$, we have $w = \beta(w)$ for all $w \in W$ and so
$\alpha(w) = \alpha\beta(w) = \beta\alpha\beta(w) \in W$, showing that W is invariant under $\alpha$.


Proposition 7.7 Let V be a vector space over a field F and let $\{W_1, \ldots, W_n\}$
be a set of subspaces of V. Then the following conditions are equivalent:
(1) $V = W_1 \oplus \cdots \oplus W_n$;
(2) There exist projections $\alpha_1, \ldots, \alpha_n$ in End(V) with $W_i = \mathrm{im}(\alpha_i)$ for
all $1 \le i \le n$, which satisfy the conditions $\alpha_i \alpha_j = \sigma_0$ for $i \neq j$ and
$\alpha_1 + \cdots + \alpha_n = \sigma_1$.


Proof (1) ⇒ (2): From (1) it follows that every $v \in V$ can be written in a unique
manner as $\sum_{i=1}^{n} w_i$, where $w_i \in W_i$ for all $1 \le i \le n$. Define $\alpha_i$ to be the projection
$v \mapsto w_i$ for each i. It is easy to verify that these linear transformations do indeed
satisfy the required conditions.

(2) ⇒ (1): Since $\alpha_1 + \cdots + \alpha_n = \sigma_1$, we surely have $V = \sum_{i=1}^{n} \mathrm{im}(\alpha_i) = \sum_{i=1}^{n} W_i$.
If $0_V \neq v \in W_h \cap \sum_{j \neq h} W_j$ then there exists an $i \neq h$ such that $\alpha_i(v) \neq 0_V$.
But $\alpha_h(v) = v$ so $\alpha_i \alpha_h \neq \sigma_0$, a contradiction. Therefore, $W_h \cap \sum_{j \neq h} W_j = \{0_V\}$
for each $1 \le h \le n$, proving (1).


Proposition 7.8 Any two complements of a subspace W of a vector space V
over a field F are isomorphic.


Proof Let U and Y be complements of W in V. By Proposition 7.7, we know that
there exists a projection $\beta \in \mathrm{End}(V)$ the image of which is U and the kernel of
which is W. Let $\alpha$ be the restriction of $\beta$ to Y. The linear transformation $\alpha$ is a
monomorphism since $\ker(\alpha) \subseteq \ker(\beta) \cap Y = W \cap Y = \{0_V\}$. Any vector $u \in U$
can be written as $w + y$, where $w \in W$ and $y \in Y$, and we have
$\alpha(y) = \beta(y) = \beta(w) + \beta(y) = \beta(w + y) = \beta(u) = u$. Thus we see that $\alpha$ is also epic and hence is
the desired isomorphism.


We now introduce a notion which is basic in all branches of mathematics. A relation
$\equiv$ defined on a given nonempty set U is called an equivalence relation if and
only if the following conditions are satisfied:

(1) $u \equiv u$ for all $u \in U$;
(2) $u \equiv u'$ if and only if $u' \equiv u$;
(3) If $u \equiv u'$ and $u' \equiv u''$ then $u \equiv u''$.


Example Let B be a nonempty subset of a set A and define a relation $\equiv_B$ on A by
setting $a \equiv_B a'$ if and only if $a = a'$ or both a and a' belong to B. Then $\equiv_B$ is an
equivalence relation on A. Similarly, if W is a subspace of a vector space V then
the relation $\equiv_W$ defined on V by setting $v \equiv_W v'$ if and only if $v - v' \in W$ is an
equivalence relation on V.


Example Let V and W be vector spaces over a field F and let $\alpha \in \mathrm{Hom}(V, W)$.
Define a relation $\equiv$ on V by setting $v \equiv v'$ if and only if $\alpha(v) = \alpha(v')$. This is easily
seen to be an equivalence relation.


Let V be a vector space over a field F. A subset G of Aut(V) is a group of
automorphisms if it is closed under taking products, contains $\sigma_1$, and satisfies the
condition that $\alpha^{-1} \in G$ whenever $\alpha \in G$. Clearly, Aut(V) itself is such a group.
The notion of a group of automorphisms is very important in linear algebra and its
applications, but here we will only touch on it.


Example Let V be a vector space over a field F and let $\alpha \in \mathrm{Aut}(V)$. Then
$\{\alpha^i \mid i \in \mathbb{Z}\}$ is surely a group of automorphisms.


Example Let V be a vector space over a field F and let $\Omega$ be a nonempty set. Every
permutation $\pi$ of $\Omega$ defines an automorphism $\alpha_\pi$ of the vector space $V^{\Omega}$ over F
defined by $\alpha_\pi(f) : i \mapsto f(\pi(i))$ for all $i \in \Omega$ and all $f \in V^{\Omega}$. The collection G of
all such automorphisms is a group of automorphisms in $\mathrm{Aut}(V^{\Omega})$.


Proposition 7.9 If V is a vector space over a field F and if G is a group of
automorphisms of V then G defines an equivalence relation $\sim_G$ on V by setting
$v \sim_G v'$ if and only if there exists an element $\alpha$ of G satisfying $\alpha(v) = v'$.


Proof If $v \in V$ then $\sigma_1(v) = v$, and so $v \sim_G v$. If $v, v' \in V$ satisfy $v \sim_G v'$ then
there exists an element $\alpha$ of G satisfying $\alpha(v) = v'$, and so $v = \alpha^{-1}(v')$. Thus
$v' \sim_G v$. Finally, if $v, v', v'' \in V$ satisfy $v \sim_G v'$ and $v' \sim_G v''$ then there exist
elements $\alpha$ and $\beta$ of G satisfying $\alpha(v) = v'$ and $\beta(v') = v''$, and so $\beta\alpha(v) = v''$.
Thus $v \sim_G v''$.
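
The equivalence classes of the relation in Proposition 7.9 can be listed for a small case. The sketch below is an illustration under stated assumptions: V is taken to be $GF(2)^3$, with vectors stored as tuples of 0s and 1s, and G is the group of powers of the cyclic shift $(a_1, a_2, a_3) \mapsto (a_3, a_1, a_2)$, which has three elements because the third power of the shift is the identity. The names `shift`, `power`, and `orbit` are hypothetical.

```python
from itertools import product

def shift(v):
    """Cyclic shift (a_1, ..., a_n) -> (a_n, a_1, ..., a_{n-1})."""
    return (v[-1],) + v[:-1]

def power(f, i):
    """The i-fold composite of f with itself (the identity when i == 0)."""
    def g(v):
        for _ in range(i):
            v = f(v)
        return v
    return g

n = 3
G = [power(shift, i) for i in range(n)]      # identity, shift, shift twice
V = list(product((0, 1), repeat=n))          # all 8 vectors of GF(2)^3

def orbit(v):
    """Equivalence class of v: all images of v under elements of G."""
    return frozenset(g(v) for g in G)

classes = {orbit(v) for v in V}
for cls in sorted(classes, key=lambda c: sorted(c)):
    print(sorted(cls))
```

The classes are pairwise disjoint and cover V, as any family of equivalence classes must.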


Proposition 7.10 If V is a vector space over a field F and if G is a group of
automorphisms of V then all elements of G have the same rank.


Proof If $\alpha \in G$ then, by Proposition 6.11, $\mathrm{rk}(\sigma_1) = \mathrm{rk}(\alpha\alpha^{-1}) \le \mathrm{rk}(\alpha) = \mathrm{rk}(\alpha\sigma_1) \le \mathrm{rk}(\sigma_1)$
and so $\mathrm{rk}(\alpha) = \mathrm{rk}(\sigma_1)$.


Exercises 


Exercise 316 
Let V be a vector space over $GF(3)$. Find an endomorphism $\alpha$ of V satisfying
$\alpha(v) + \alpha(v) = v$ for all $v \in V$.


Exercise 317 

Let V be a vector space finitely generated over a field F and let a, $, y € End(V). 
Find necessary and sufficient conditions for there to exist an endomorphism 6 of 
V satisfying «yp = Bea. 


Exercise 318 
Let F = GF(2) and let n be a positive integer. Let a: F” —> F” be the function 
ai ai 


defined bya: | : |>| : |, where 0! = 1 and 1’ = 0. Is œ an endomorphism 


of F”? 


Exercise 319 
Let a, B : Q[X] —> Q[X] be defined by a: p(X) > Xp(X) and £ : p(X) > 
X? p(X). Show that a, 6, and a — £ are all monic endomorphisms of Q[X]. 


Exercise 320 

Let V be a finitely-generated vector space over a field F and let $\alpha \in \mathrm{End}(V)$.
Show that $\alpha$ is not monic if and only if there exists an endomorphism $\beta \neq \sigma_0$ of
V satisfying $\alpha\beta = \sigma_0$.


Exercise 321

Let V be a vector space over a field F and let $\alpha \in \mathrm{End}(V)$. Show that
$\ker(\alpha) = \ker(\alpha^2)$ if and only if $\ker(\alpha)$ and $\mathrm{im}(\alpha)$ are disjoint.


Exercise 322

Let V be a vector space over a field F and let $\alpha \in \mathrm{End}(V)$. Show that
$\mathrm{im}(\alpha) = \mathrm{im}(\alpha^2)$ if and only if $V = \ker(\alpha) + \mathrm{im}(\alpha)$.


122 7 The Endomorphism Algebra of a Vector Space 


Exercise 323 

Let V be a vector space over a field F and let K = F x V x End(V), which is 
again a vector space over F. Define an operation © on K by setting (a, v, œ) © 
(b, w, p) = (ab,aw + (v), Ba). Is (K, ©) an F-algebra? Is it associative? Is it 
unital? 


Exercise 324 
Let V be a vector space over a field F, and let Aff(V,V) be the set of all 
affine transformations from V to itself. Is Aff(V, V), on which we have defined 
the operations of addition and composition of functions, an associative unital 
F-algebra? 


Exercise 325 


Let œ € Aut(R2) be defined by a: f 


b 
tal subalgebra of End(R?). Show that it is proper by giving an example of an 
endomorphism of R? not in this subalgebra. 


| = E. Show that R{a, o1} is a uni- 


Exercise 326 

Let V be the space of all real-valued functions on the interval [—1, 1] which are 
infinitely differentiable, and let 5 be the endomorphism of V which assigns to 
each function f its derivative. Find the kernel and image of ô. 


Exercise 327 

Let $\alpha : \mathbb{C} \to \mathbb{C}$ be the function defined by $\alpha : a + bi \mapsto -b + ai$. Is $\alpha$ an endomorphism
of $\mathbb{C}$ considered as a vector space over $\mathbb{R}$? Is it an endomorphism of
$\mathbb{C}$ considered as a vector space over itself?


Exercise 328

Let V be a vector space of finite dimension n over a field F and let $\alpha \in \mathrm{End}(V)$.
Show that there exists an automorphism $\beta$ of V satisfying $\alpha\beta\alpha = \alpha$.


Exercise 329 
Let V = M2x2(R), which is a vector space over R. Let a € VV be defined by 
a a a a ; 
el A lanl lan] . Is æ an endomorphism of V? 
a an lazı]  |a22| 
Exercise 330 
Consider $\mathbb{R}$ as a vector space over $\mathbb{Q}$ and let $\alpha$ be an endomorphism of this space
satisfying the condition that there exists an $a_0 \in \mathbb{R}$ such that $\alpha$ is continuous at $a_0$.
Show that $\alpha$ is continuous at every $a \in \mathbb{R}$.


Exercise 331

Let A be a nonempty set and let V be the collection of all subsets of A, considered
as a vector space over $GF(2)$. For which subsets C of A is the function
$B \mapsto B \cup C$ an endomorphism of V?


Exercise 332

Let V be a vector space of finite dimension n over a field F and let
$\{\alpha_{ij} \mid 1 \le i, j \le n\}$ be a collection of endomorphisms of V, not all of which are equal
to $\sigma_0$, satisfying the condition that

$$\alpha_{ij}\alpha_{kh} = \begin{cases} \alpha_{ih} & \text{if } j = k, \\ \sigma_0 & \text{otherwise.} \end{cases}$$

Show that there exists a basis $\{v_1, \ldots, v_n\}$ of V such that

$$\alpha_{jk}(v_i) = \begin{cases} v_j & \text{if } i = k, \\ 0_V & \text{otherwise.} \end{cases}$$

Exercise 333 

Let V be a vector space of finite dimension n over a field F and choose an 
element a € End(V). Let g : End(V) —> End(V) be the function defined by 
Bt Ba. This is an endomorphism of End(V), considered as a vector space 
over F. Show that a positive integer n satisfies a” = oo if and only if ” is 
the 0-function. 


Exercise 334 

Let œ be an endomorphism of R? satisfying the condition that œ? = dọ. Show 
that there exists a linear transformation £ : R? > R and that there exists a vector 
y € R? satisfying a (v) = B(v)y for all v € R?. 


Exercise 335 

For each 0 4a € R, let a : C > C be the function defined by Bg : z > z + az. 
Show that 64 is an endomorphism of C considered as a vector space over R, and 
describe its image and kernel. 


Exercise 336 
Let V be a vector space finitely generated over Q and let a, 8 € End(V) satisfy 
3a? + 7a? — 208 + 4a — 01 = o0. Show that af = Ba. 


Exercise 337 

Let F be a field of characteristic other than 2 and let V be a vector space of finite 
dimension n over F. Let æ be an endomorphism of V satisfying the condition 
that æ? = o1. Show that rk(o1 — æ) + rk(o1 +a) =n. 


Exercise 338 

Let V be a vector space over a field F which is not finitely generated, and let 
oo Æ æ € End(V). Set A = {6 € End(V) | a8 = o1}. Show that if A has more 
than one element then it is infinite. 


Exercise 339 
Let V be a vector space over a field F having dimension greater than 1. Show 
that there exists a function œ € VV which is not an endomorphism of V but 
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which nonetheless satisfies the condition that a(av) = aa(v) for all a € F and 
allue V. 


Exercise 340 
Let V be a vector space over a field F satisfying the condition that $\alpha\beta = \beta\alpha$ for
all $\alpha, \beta \in \mathrm{End}(V)$. Show that dim(V) = 1.


Exercise 341 

Let V = M2x2(R), considered as a vector space over R. Let a: V —> V be the 
“|| a+2b+c+2d | 

d 3a+6b+2c+5d a+2b+c+2d | 

Is æ an endomorphism of V? Is it an automorphism of V? 


function defined by a: 


Exercise 342 

Let V be the vector space of all continuous functions from R to itself and let 
$\alpha : V \to V$ be the function defined by $\alpha : f(x) \mapsto [x^2 + \sin(x) + 2] f(x)$. Show
that $\alpha$ is an automorphism of V.


Exercise 343 
Let F be a field and let $\alpha : F[X] \to F[X]$ be the function defined by $\alpha : p(X) \mapsto
p(X + 1)$. Is $\alpha$ an endomorphism of F[X]? Is it an automorphism?


Exercise 344 

Let F be a field and, for each $a \in F$, let $\theta_a$ be the endomorphism of F[X] defined
by $\theta_a : p(X) \mapsto p(X + a)$. Let $\alpha \in \mathrm{End}(F[X])$ satisfy $\alpha(X) \in F$ and $\alpha\theta_a = \theta_a\alpha$
for all $a \in F$. Can $\alpha$ be a monomorphism?


Exercise 345 


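Let $\alpha \in \mathrm{End}(\mathbb{R}^3)$ be given by

$$\alpha : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} a - 2b \\ c \\ a - b \end{bmatrix}.$$

Is $\alpha$ an automorphism of $\mathbb{R}^3$?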


Exercise 346 
Let a be the endomorphism of R defined by 


a i [di a2, a3, ...] > [bi bz, b3,...], 


where bp = Ea ie for each h > 1. Show that œ is an automor- 
phism satisfying wa =a7!. 

Exercise 347 

Let V be a vector space finitely generated over $\mathbb{R}$ and let $\alpha$ be an endomorphism
of V satisfying $\alpha^3 + 4\alpha^2 + 2\alpha + \sigma_1 = \sigma_0$. Show that $\alpha \in \mathrm{Aut}(V)$.


Exercise 348

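Let V be a vector space over a field F and let $\alpha, \beta \in \mathrm{End}(V)$ satisfy $\alpha\beta = \sigma_1$. Set
$\varphi = \sigma_1 - \beta\alpha$. Show that for every integer $n \ge 1$ we have
$\sigma_1 = \sum_{k=0}^{n-1} \beta^k \varphi \alpha^k + \beta^n \alpha^n$.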


Exercise 349 

Let V be the space of all polynomial functions from the interval [0, 1] on the 
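real line to $\mathbb{R}$. Let $\alpha$ and $\beta$ be the endomorphisms of V defined by
$\alpha(f) : x \mapsto \int_0^x f(t)\,dt$ and $\beta(f) : x \mapsto \int_x^1 f(t)\,dt$. Find $\mathrm{im}(\alpha + \beta)$. Is it true that $\alpha\beta = \beta\alpha$?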


Exercise 350 

Let F be a field and let $V = F^\infty$. Let $n > 1$ be an integer. Each vector
$y = \begin{bmatrix} d_1 \\ \vdots \\ d_n \end{bmatrix} \in F^n$ defines an endomorphism $\theta_y$ of V by
$\theta_y : [a_1, a_2, \ldots] \mapsto [b_1, b_2, \ldots]$, where $b_h = \sum_{i=1}^{n} a_{h-1+i} d_i$ for $h = 1, 2, \ldots$.
Show that if $\theta_y$ is a monomorphism then the polynomial
$p(X) = \sum_{i=1}^{n} d_i X^{i-1} \in F[X]$ has no roots in F.


Exercise 351 
Let F = GF(5) and let V = F?. How many endomorphisms «œ of V satisfy the 


1 2 0 1 
conditions a 0 =| 1 | anda 3 S E 
(0) (0) (0) 1 


Exercise 352 

Let V be the set of all continuous functions from R to itself, which is a vector 
space over R. Let a: V —> V be the function defined by a(f) : x > f (5) for all 
x € Randall f € V. Is œ an automorphism of V? 


Exercise 353 

Let V = R” and let W be the subspace of V consisting of all convergent se- 
quences. Let a € End(V) be defined by « : [a1, a2, ...] > [b1, b2,...], where 
br = ODA ai) for all h > 1. If v € V satisfies æ (v) € W, is v itself necessarily 
in W? 


Exercise 354 

Let V be a vector space over a field F and let $\alpha \in \mathrm{Aut}(V)$. Let $W_1, \ldots, W_k$ be
subspaces of V satisfying $V = \bigoplus_{i=1}^{k} W_i$. For each $1 \le i \le k$, let $Y_i = \{\alpha(w) \mid
w \in W_i\}$. Is $V = \bigoplus_{i=1}^{k} Y_i$?


Exercise 355 
Consider $\mathbb{R}$ as a vector space over $\mathbb{Q}$. An endomorphism $\alpha$ of this space is
bounded if and only if there exists a nonnegative real number $m(\alpha)$ satisfying
the condition that $|\alpha(x)| \le m(\alpha)|x|$ for all $x \in \mathbb{R}$. Does the set of all bounded
endomorphisms of $\mathbb{R}$ form an $\mathbb{R}$-subalgebra of $\mathrm{End}(\mathbb{R})$?


Exercise 356 

Let F be a field, let n be a positive integer, and let $V = M_{n \times n}(F)$. Given a
matrix $B \in V$, is the function $\alpha_B : V \to V$ defined by $\alpha_B : A \mapsto AB + BA$ an
endomorphism of V?


Exercise 357 

Let F be a field and let V = F[X]. Let $\delta \in \mathrm{End}(V)$ be the formal differentiation
function and let $\alpha \in \mathrm{End}(V)$ be defined by $\alpha : p(X) \mapsto Xp(X)$. Show that
$\delta\alpha - \alpha\delta = \sigma_1$.


Exercise 358 
Let V be a nontrivial vector space over a field F. Is the set of all automorphisms 
of V a subspace of the vector space End(V) over F? 


Exercise 359 
Consider GF(3) as a vector space over itself. Does there exist an automorphism 
of this space other than $\sigma_1$?


Exercise 360 

Let $V = F^\infty$ for some field F. Each $w = [c_1, c_2, \ldots] \in V$ defines a function
$\beta_w : V \to V$ by $\beta_w : [a_1, a_2, \ldots] \mapsto [a_1, a_1c_1 + a_2, (a_1c_1 + a_2)c_2 + a_3, \ldots]$. Show
that $\beta_w$ is an automorphism of V.


Exercise 361 

Let V be a vector space over a field F; leta~ € End(V) and let $ € Aut(V). Define 
. y2 2 . JX B(v) 

the function 0 : V4 > V* by setting 6: | = E aail Is 6 necessarily 

an automorphism of V7? 


Exercise 362 


= 2 1a 2b 
Let F = GF(5) and let a € Aut(F~) be defined by a : B = p + N Show 


that there exists a positive integer h satisfying o/+! = œ and find the smallest 
such integer h. 


Exercise 363 

Let F be a field of characteristic other than 2. Let V be a vector space over F 
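and let $\alpha, \beta, \gamma, \delta$ be endomorphisms of V satisfying the condition that $\alpha - \beta$ and
$\alpha + \beta$ are automorphisms of V. Show that there exist endomorphisms $\varphi$ and $\psi$
of V satisfying $\varphi\alpha + \psi\beta = \gamma$ and $\psi\alpha + \varphi\beta = \delta$.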


Exercise 364

Let V be a vector space of finite dimension n over a field F. Let $\alpha \in \mathrm{End}(V)$
and assume that there exists a vector $y \in V$ satisfying the condition that $D =
\{\alpha(y), \alpha^2(y), \ldots, \alpha^n(y)\}$ is a basis for V. Show that $D' = \{y, \alpha(y), \ldots, \alpha^{n-1}(y)\}$
is also a basis for V and that $\alpha \in \mathrm{Aut}(V)$.


Exercise 365 
Let F be a field and let $V = F^{\mathbb{N}}$. Let $\alpha$ be the endomorphism of V defined by
$\alpha(f) : i \mapsto f(i + 1)$ for all $f \in V$. Show that $\alpha - c\sigma_1 \notin \mathrm{Aut}(V)$ for all $0 \ne c \in F$.


Exercise 366 

Let V be a vector space of finite dimension n over a field F, and let $0 < k < n$
be a positive integer. Let $A_k$ be the set of all subspaces of V having dimension k.
Let $\alpha \in \mathrm{Aut}(V)$ and, for each $W \in A_k$, let $\theta_\alpha(W) = \{\alpha(w) \mid w \in W\}$. Show that
the function $\theta_\alpha$ is a permutation of $A_k$.


Exercise 367 
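Let r, s, and t be distinct real numbers and let $\alpha$ be the endomorphism of $\mathbb{R}^3$
defined by

$$\alpha : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} a + br + cr^2 \\ a + bs + cs^2 \\ a + bt + ct^2 \end{bmatrix}.$$

Is $\alpha$ an automorphism of $\mathbb{R}^3$?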


Exercise 368 
Let V be a vector space over a field F and let $\alpha \in \mathrm{End}(V)$. Show that $W =
\bigcup_{i=1}^{\infty} \ker(\alpha^i)$ is a subspace of V which is invariant under $\alpha$.


Exercise 369 
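Let $\alpha$ and $\beta$ be the endomorphisms of $\mathbb{Q}^4$ defined by

$$\alpha : \begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix} \mapsto \begin{bmatrix} 2a - 2b - 2c - 2d \\ 5b - c - d \\ -b + 5c - d \\ -b - c + 5d \end{bmatrix} \quad\text{and}\quad \beta : \begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix} \mapsto \begin{bmatrix} 0 \\ -b + 2c + 3d \\ 2b - 3c + 6d \\ 3b + 6c + 2d \end{bmatrix}.$$

Find two nontrivial proper subspaces of $\mathbb{Q}^4$ which are invariant both under $\alpha$ and
under $\beta$.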


Exercise 370 
Let F be a field and let V = F*. Let œ be the endomorphism of V defined by 


a a—b 
b a—b i ; ; 

a: a |P a=b . Does there exist a two-dimensional subspace of V 
d c—b-d 


invariant under a? 


Exercise 371

Let V be a vector space over a field F and let $\alpha \in \mathrm{End}(V)$. If W and Y are
subspaces of V which are invariant under $\alpha$, show that both $W + Y$ and $W \cap Y$
are invariant under $\alpha$.


Exercise 372 
Let W be a subspace of a vector space V over a field F and let S be the set of all 
$\alpha \in \mathrm{End}(V)$ such that W is invariant under $\alpha$. Is S necessarily an F-subalgebra
of End(V)? 


Exercise 373 

Let V be a vector space over a field F and let a € End(V). If W is a subspace of 
V, show that the set of all subspaces of W which are invariant under a, partially 
ordered by inclusion, has a maximal element. 


Exercise 374 

Let $V = \mathbb{R}^\infty$ and let W be the subspace of V consisting of all sequences
$[a_1, a_2, \ldots]$ for which the series $\sum_{i=1}^{\infty} a_i$ converges. Let $\sigma$ be a permutation of the
set of all positive integers and let $\alpha \in \mathrm{End}(V)$ be defined by $\alpha : [a_1, a_2, \ldots] \mapsto
[a_{\sigma(1)}, a_{\sigma(2)}, \ldots]$. Is W invariant under $\alpha$?


Exercise 375 

Let V be a vector space over a field F. Let $0 \ne c \in F$ and let $\alpha \in \mathrm{End}(V)$. Let
$\{x_0, x_1, \ldots, x_n\}$ be a set of vectors in V satisfying $\alpha(x_0) = cx_0$ and $\alpha(x_i) - cx_i =
x_{i-1}$ for all $1 \le i \le n$. Show that $F\{x_0, x_1, \ldots, x_n\}$ is a subspace of V which is
invariant under $\alpha$.


Exercise 376 

Let F be a field which is not finite and let V be a vector space over F having 
dimension greater than 1. For each $0 \ne c \in F$, show that there exist infinitely
many distinct subspaces of V which are invariant under the endomorphism $\sigma_c$
of V.


Exercise 377 

Let V be a vector space of finite dimension n over a field F. Let $\alpha \in \mathrm{End}(V)$
and let $\beta \in \mathrm{End}(V)$ satisfy $\beta^2 = \alpha$. Find a positive integer k such that
$\mathrm{rk}(\beta) \le \frac{1}{k}[\mathrm{rk}(\alpha) + n]$.


Exercise 378 

Let $\alpha$ and $\beta$ be endomorphisms of a vector space V over a field F and let
$\theta \in \mathrm{Aut}(V)$ satisfy $\theta\alpha = \beta\theta$. Show that a subspace W of V is invariant under
$\alpha$ if and only if $W' = \{\theta(w) \mid w \in W\}$ is invariant under $\beta$.


Exercise 379 
Let $\alpha$ and $\beta$ be endomorphisms of a vector space V over a field F satisfying
$\alpha\beta = \beta\alpha$. Is $\ker(\alpha)$ invariant under $\beta$?


Exercise 380
Let V be a vector space over a field F and let $\alpha \in \mathrm{End}(V)$ be a projection. Show
that $\sigma_1 - \alpha$ is also a projection.


Exercise 381 
Let V be a vector space finitely generated over a field F and let a € End(V) 
satisfy the condition a*(o; — œ) = 09. Is œ necessarily a projection? 


Exercise 382 

Let V be the space of all continuous functions from $\mathbb{R}$ to itself and let $W =
\mathbb{R}\{\sin(x), \cos(x)\} \subseteq V$. Let $\delta$ be the endomorphism of W which assigns to each
function its derivative. Find a polynomial $p(X) \in \mathbb{R}[X]$ of degree 2 satisfying
$p(\delta) = \sigma_0$.


Exercise 383 
Let V be a vector space finitely generated over $\mathbb{Q}$ and assume that there exists an
$\alpha \in \mathrm{Aut}(V)$ satisfying $\alpha^{-1} = \alpha^2 + \alpha$. Show that dim(V) is divisible by 3.


Exercise 384 
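Let n be a positive integer and let

$$G = \left\{ \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix} \in \mathbb{R}^n \;\middle|\; a_i \ge 0 \text{ for all } 1 \le i \le n \right\}.$$

Let $\alpha$ be an endomorphism of $\mathbb{R}^n$ satisfying the condition that $\alpha(v) \in G$ implies
that $v \in G$. Show that $\alpha \in \mathrm{Aut}(\mathbb{R}^n)$.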


Exercise 385 

Let V be a vector space over a field F and let W and Y be subspaces of V 
satisfying $W + Y = V$. Let $Y'$ be a complement of Y in V and let $Y''$ be a
complement of $W \cap Y$ in W. Show that $Y' \cong Y''$.


Exercise 386 

Let F be a field of characteristic other than 2 and let V be a vector space over F. 
Let $\alpha, \beta \in \mathrm{End}(V)$ be projections satisfying the condition that $\alpha + \beta$ is also a
projection. Show that $\alpha\beta = \beta\alpha = \sigma_0$.


Exercise 387 
Let V be a vector space over F and let $\alpha, \beta \in \mathrm{End}(V)$. Show that $\alpha$ and $\beta$ are
projections satisfying $\ker(\alpha) = \ker(\beta)$ if and only if $\alpha\beta = \alpha$ and $\beta\alpha = \beta$.


Exercise 388 
Let V be a vector space finitely generated over a field F and let $\alpha \ne \sigma_1$ be an
endomorphism of V which is a product of projections. Show that $\alpha \notin \mathrm{Aut}(V)$.


Exercise 389 
Let V be a vector space over $\mathbb{Q}$ and let $\alpha \in \mathrm{End}(V)$. Show that $\alpha$ is a projection
if and only if $(2\alpha - \sigma_1)^2 = \sigma_1$.


Exercise 390
Let $\alpha$ and $\beta$ be endomorphisms of a vector space V over a field F and let
$f(X) \in F[X]$ satisfy $f(\alpha\beta) = \sigma_0$. Set $g(X) = Xf(X)$. Show that $g(\beta\alpha) = \sigma_0$.


Exercise 391 
Let $V = \mathbb{R}^{\mathbb{R}}$ and let $g \in V$. Find necessary and sufficient conditions on g for the
endomorphism $f \mapsto gf$ of V to be a projection.


Exercise 392 

Let W be a subspace of a vector space V over a field F which is invariant under
an endomorphism $\alpha$ of V. Let $\beta \in \mathrm{End}(V)$ be a projection satisfying the
condition that $\mathrm{im}(\beta) = W$. Show that $\beta\alpha\beta = \alpha\beta$.


Exercise 393 

Let V be a vector space of finite dimension n over a field F and let $\alpha \in \mathrm{End}(V)$.
Show that there exists an automorphism $\beta$ of V and a projection $\theta$ of V satisfying
$\alpha = \beta\theta$.


Exercise 394 

Let F be a field of characteristic other than 2 and let V be a vector space over F. 
Let $\alpha \in \mathrm{End}(V)$ be a projection satisfying the condition that $\alpha - \beta$ is a projection
for all $\beta \in \mathrm{End}(V)$. Show that $\alpha = \sigma_1$.


Exercise 395 
Let V be a vector space over F and let $\alpha, \beta \in \mathrm{End}(V)$ be projections satisfying
the condition that $\mathrm{im}(\alpha)$ and $\mathrm{im}(\beta)$ are disjoint. Is it necessarily true that
$\alpha\beta = \beta\alpha$?


Exercise 396 

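Let V be a vector space of finite dimension n over a field F and let $S = \mathrm{End}(V) \setminus
\mathrm{Aut}(V)$. For $\alpha, \beta \in S$, show that $\mathrm{im}(\alpha) = \mathrm{im}(\beta)$ if and only if $\{\alpha\theta \mid \theta \in S\} =
\{\beta\theta \mid \theta \in S\}$.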


Exercise 397 

Let F be a field. Does there exist an endomorphism œ of F? which is not a 
projection satisfying the condition that a7 is a projection equal neither to og nor 
to 01. 


Exercise 398 

Let V be a vector space finite dimensional over a field F and let $\alpha$ be an
endomorphism of V. Show that there exists a positive integer k such that $\mathrm{im}(\alpha^k)$ and
$\ker(\alpha^k)$ are disjoint.


Exercise 399

Let F be a field of characteristic other than 2, and let V be a vector space over F. 
Let $\alpha \in \mathrm{End}(V)$ satisfy $\alpha^3 = \alpha$. Show that $V = W_1 \oplus W_2 \oplus W_3$, where $W_1 =
\{v \in V \mid \alpha(v) = v\}$, $W_2 = \{v \in V \mid \alpha(v) = -v\}$, and $W_3 = \ker(\alpha)$.


Exercise 400 

Let F be a field of characteristic other than 2 and let V be a finitely-generated 
vector space over F. Show that every endomorphism of V is the sum of two 
automorphisms of V. 


Exercise 401 

Let n > 1 be an integer and let 0 : R” —> R be the function defined by 
ai 

@:| : |e Vey a?. Assume that we can define an operation e on R” sat- 


dn 
isfying the condition that (R”, e) is an associative unital R-algebra with multi- 
plicative identity e, and also satisfying the condition that 6 (v e w) = 0 (v)@ (w) 
for all v, w € R”. Show that (R”, +, e) is a division algebra over R. 


Exercise 402 

Any sequence v = [a1,a2,...] € R defines an endomorphism «œ, of R[X] 
which acts on elements of the canonical basis of R[X] according to the rule 
æy : X” > Pro (1) (kKDak+1 X” for each nonnegative integer n. Given a € R, 
find v, w € R” such that œ, : p(X) p(X +a) and ay: p(X) > p(X +a) — 
p(a). 


Exercise 403 

Let V be a vector space over a field F and let G be a group of automorphisms 
of V. For $v \in V$, define the stabilizer of v in G to be $G_v = \{\alpha \in G \mid \alpha(v) = v\}$.
Is this necessarily a group of automorphisms of V? 


8 Representation of Linear Transformations by Matrices


In this chapter, we show how we can study linear transformations between finitely-
generated vector spaces by studying matrices. Let V and W be finitely-generated
vector spaces over a field F, where dim(V) = n and dim(W) = k. Fix bases
$B = \{v_1, \ldots, v_n\}$ of V and $D = \{w_1, \ldots, w_k\}$ of W. From Proposition 5.4, we know
that if we are given a linear transformation $\alpha \in \mathrm{Hom}(V, W)$ then for each $1 \le j \le n$
there exist scalars $a_{1j}, \ldots, a_{kj}$ satisfying the condition $\alpha(v_j) = \sum_{i=1}^{k} a_{ij}w_i$, and
that these scalars are in fact uniquely determined by $\alpha$. Thus $\alpha$ defines a matrix
$[a_{ij}] \in M_{k \times n}(F)$. Conversely, assume we have a matrix $A = [a_{ij}] \in M_{k \times n}(F)$.
Then we know that every vector v in V can be written in a unique way in the form
$\sum_{j=1}^{n} b_jv_j$, and so A defines a linear transformation $\alpha \in \mathrm{Hom}(V, W)$ by setting

$$\alpha : v \mapsto \sum_{i=1}^{k} \Bigl( \sum_{j=1}^{n} a_{ij}b_j \Bigr) w_i.$$

Moreover, it is clear that different linear transformations
in Hom(V, W) define different matrices in $M_{k \times n}(F)$ and different matrices
in $M_{k \times n}(F)$ define different linear transformations in Hom(V, W). We summarize
the above remarks in the following proposition.


With kind permission of the Special collections, Fine Arts Library, Harvard Univer- 
sity. 

The theory of matrices and their relation to linear transformations was 
developed in detail by the nineteenth-century British mathematician 
Sir Arthur Cayley, one of the most prolific researchers in history. 


Proposition 8.1 Let V be a vector space of finite dimension n over a field
F and let W be a vector space of finite dimension k over F. For every
basis B of V and every basis D of W there exists a bijective function
$\Phi_{BD} : \mathrm{Hom}(V, W) \to M_{k \times n}(F)$, which is an isomorphism of vector spaces
over F.


J.S. Golan, The Linear Algebra a Beginning Graduate Student Ought to Know, 133 
DOI 10.1007/978-94-007-2636-9_8, © Springer Science+Business Media B.V. 2012 
Proof We have already seen that if $B = \{v_1, \ldots, v_n\}$ and $D = \{w_1, \ldots, w_k\}$, then
the function $\Phi_{BD}$ is defined by $\Phi_{BD}(\alpha) = [a_{ij}]$, where $\alpha(v_j) = \sum_{i=1}^{k} a_{ij}w_i$ for all
$1 \le j \le n$, and that this function is bijective. We are therefore left to show that this
is a linear transformation. Indeed, if $\Phi_{BD}(\alpha) = [a_{ij}]$ and $\Phi_{BD}(\beta) = [b_{ij}]$ then

$$(\alpha + \beta)(v_j) = \sum_{i=1}^{k} (a_{ij} + b_{ij})w_i = \sum_{i=1}^{k} a_{ij}w_i + \sum_{i=1}^{k} b_{ij}w_i = \alpha(v_j) + \beta(v_j)$$

for all $1 \le j \le n$, and so $\Phi_{BD}(\alpha + \beta) = \Phi_{BD}(\alpha) + \Phi_{BD}(\beta)$. Similarly, if $c \in F$
then $(c\alpha)(v_j) = \sum_{i=1}^{k} ca_{ij}w_i = c\bigl(\sum_{i=1}^{k} a_{ij}w_i\bigr) = c(\alpha(v_j))$ for all $1 \le j \le n$, and
so $\Phi_{BD}(c\alpha) = c\Phi_{BD}(\alpha)$. Thus we see that $\Phi_{BD}$ is indeed a linear transformation
and thus also an isomorphism.


We have already seen that, in the above situation, dim(Mkxn(F)) = kn and so, 
by Proposition 6.9, we also see that dim(Hom(V, W)) = kn. 


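Example Let $V = \mathbb{R}^3$ and let B be the canonical basis on V. Each vector
$v = \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} \in V$ defines a linear transformation $\alpha_v : V \to V$ given by
$\alpha_v : w \mapsto v \times w$. Then

$$\Phi_{BB}(\alpha_v) = \begin{bmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{bmatrix}.$$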
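
The matrix in this example can be checked numerically. The following is a minimal sketch, assuming vectors are stored as Python lists of length 3; the helper names `cross`, `cross_matrix`, and `mat_vec` are illustrative and do not come from the text.

```python
def cross(v, w):
    """Cross product v x w of two vectors in R^3 (lists of length 3)."""
    return [v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0]]


def cross_matrix(v):
    """Matrix representing w -> v x w with respect to the canonical basis."""
    a1, a2, a3 = v
    return [[0, -a3, a2],
            [a3, 0, -a1],
            [-a2, a1, 0]]


def mat_vec(A, w):
    """Product of a matrix (list of rows) with a column vector."""
    return [sum(a * b for a, b in zip(row, w)) for row in A]


v, w = [2, -1, 3], [4, 0, 5]
print(cross(v, w))                   # [-5, 2, 4]
print(mat_vec(cross_matrix(v), w))   # [-5, 2, 4]
```

Column j of the matrix is the image of the j-th canonical basis vector under the map, which is exactly how the matrix of a linear transformation is built.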


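Example Let $V = \mathbb{R}^3$ and $W = \mathbb{R}^2$, and choose the bases

$$B = \left\{ \begin{bmatrix} 0.5 \\ -0.5 \\ 0 \end{bmatrix}, \begin{bmatrix} 0.5 \\ 0 \\ -0.5 \end{bmatrix}, \begin{bmatrix} 0 \\ 0.5 \\ 0.5 \end{bmatrix} \right\}$$

of V and $D = \left\{ \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right\}$ of W. If $\begin{bmatrix} r \\ s \\ t \end{bmatrix} \in \mathbb{R}^3$ then there exist $b_1, b_2, b_3 \in \mathbb{R}$
satisfying

$$\begin{bmatrix} r \\ s \\ t \end{bmatrix} = b_1 \begin{bmatrix} 0.5 \\ -0.5 \\ 0 \end{bmatrix} + b_2 \begin{bmatrix} 0.5 \\ 0 \\ -0.5 \end{bmatrix} + b_3 \begin{bmatrix} 0 \\ 0.5 \\ 0.5 \end{bmatrix} = \frac{1}{2} \begin{bmatrix} b_1 + b_2 \\ -b_1 + b_3 \\ -b_2 + b_3 \end{bmatrix},$$

and so we have $2r = b_1 + b_2$, $2s = -b_1 + b_3$, and $2t = -b_2 + b_3$. From this we get
$b_1 = r - s + t$, $b_2 = r + s - t$, and $b_3 = r + s + t$.
The matrix $A = \begin{bmatrix} 3 & 5 & 7 \\ 4 & 8 & 2 \end{bmatrix}$ defines a linear transformation $\alpha \in \mathrm{Hom}(V, W)$
given by

$$\alpha : b_1 \begin{bmatrix} 0.5 \\ -0.5 \\ 0 \end{bmatrix} + b_2 \begin{bmatrix} 0.5 \\ 0 \\ -0.5 \end{bmatrix} + b_3 \begin{bmatrix} 0 \\ 0.5 \\ 0.5 \end{bmatrix} \mapsto (3b_1 + 5b_2 + 7b_3) \begin{bmatrix} 1 \\ 1 \end{bmatrix} + (4b_1 + 8b_2 + 2b_3) \begin{bmatrix} 1 \\ 0 \end{bmatrix},$$

so

$$\alpha \begin{bmatrix} r \\ s \\ t \end{bmatrix} = (15r + 9s + 5t) \begin{bmatrix} 1 \\ 1 \end{bmatrix} + (14r + 6s - 2t) \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 29r + 15s + 3t \\ 15r + 9s + 5t \end{bmatrix}.$$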
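
The arithmetic of this example can be replayed in a few lines. This is a sketch under stated assumptions: the basis D = {[1, 1], [1, 0]} and the matrix A = [[3, 5, 7], [4, 8, 2]] are read off from the coefficients in the displayed formulas, and the function names are illustrative.

```python
from fractions import Fraction as Fr

# Basis B of R^3 as in the example; D and A are assumptions inferred from
# the coefficients appearing in the displayed formulas.
B = [[Fr(1, 2), Fr(-1, 2), 0], [Fr(1, 2), 0, Fr(-1, 2)], [0, Fr(1, 2), Fr(1, 2)]]
D = [[1, 1], [1, 0]]
A = [[3, 5, 7], [4, 8, 2]]


def coords_in_B(r, s, t):
    """Coordinates b1, b2, b3 of [r, s, t] with respect to B."""
    return [r - s + t, r + s - t, r + s + t]


def recombine(b):
    """Rebuild the vector b1*B[0] + b2*B[1] + b3*B[2] in canonical coordinates."""
    return [sum(b[j] * B[j][m] for j in range(3)) for m in range(3)]


def alpha(r, s, t):
    """Image of [r, s, t] in canonical coordinates of R^2."""
    b = coords_in_B(r, s, t)
    # Coordinates of the image with respect to D are the entries of A times b.
    c = [sum(A[i][j] * b[j] for j in range(3)) for i in range(2)]
    return [sum(c[i] * D[i][m] for i in range(2)) for m in range(2)]


print(recombine(coords_in_B(1, 2, 3)) == [1, 2, 3])  # True
print(alpha(1, 2, 3))                                # [68, 48]
```

The value [68, 48] agrees with the closed form [29r + 15s + 3t, 15r + 9s + 5t] at r = 1, s = 2, t = 3.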


It is very important to emphasize that the matrix representation of a linear trans- 
formation depends on the bases which we fixed at the beginning, and on the order 
in which the elements of the bases are written! If we choose different bases or write 
the elements of a chosen basis in a different order, we will get a different matrix. 
Shortly, we will consider the relation between the matrices which represent a given 
linear transformation with respect to different bases. 

Let V be a vector space finitely generated over a field F, let $\alpha$ be an endomorphism
of V, and let W be a subspace of V which is invariant under $\alpha$. As we
have already seen, the restriction $\beta$ of $\alpha$ to W is an endomorphism of W. Now, let
$B = \{v_1, \ldots, v_k\}$ be a basis for W, which we can expand to a basis $D = \{v_1, \ldots, v_n\}$
for all of V. If $\Phi_{DD}(\alpha) = [a_{ij}]$ then for all $1 \le j \le k$ we have $\alpha(v_j) = \sum_{i=1}^{k} a_{ij}v_i$,
and so $a_{ij} = 0$ whenever $1 \le j \le k$ and $k < i \le n$. Thus we see that the matrix
$\Phi_{DD}(\alpha)$ is of the form $\begin{bmatrix} A_{11} & A_{12} \\ O & A_{22} \end{bmatrix}$, where $A_{11} = \Phi_{BB}(\beta)$. The subspace
$Y = F\{v_{k+1}, \ldots, v_n\}$ of V is a complement of W in V. If it too is invariant under $\alpha$
then we would also have $A_{12} = O$, and so $\alpha$ is represented by a matrix composed
of two square matrices “strung out” along the diagonal. From a computational point
of view, such a representation has distinct advantages.

Besides addition and scalar multiplication of matrices, we can also define the
product of two matrices, provided that these matrices are of suitable sizes. Let $(K, \bullet)$
be an associative unital algebra over a field F. If $A = [v_{ij}] \in M_{k \times n}(K)$ and $B =
[w_{jh}] \in M_{n \times t}(K)$ for some positive integers k, n, and t, we define the matrix AB to
be the matrix $[y_{ih}] \in M_{k \times t}(K)$ where, for each $1 \le i \le k$ and all $1 \le h \le t$, we set
$y_{ih} = \sum_{j=1}^{n} v_{ij} \bullet w_{jh}$. For the most part, we will be interested in this construction for
the case K = F, but sometimes we will have need of the more general construction.
Note that a necessary condition for the product of two matrices to be defined is that
the number of columns in the first matrix be equal to the number of rows in the
second matrix.
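
The definition translates directly into code. The following is a minimal sketch for the case K = F, with matrices stored as lists of rows; the function name `mat_mul` and the sample matrices are illustrative choices, not taken from the text.

```python
def mat_mul(A, B):
    """Product of a k x n matrix A and an n x t matrix B (lists of rows)."""
    k, n, t = len(A), len(B), len(B[0])
    # The product is defined only when every row of A has as many entries
    # as B has rows.
    if any(len(row) != n for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    return [[sum(A[i][j] * B[j][h] for j in range(n)) for h in range(t)]
            for i in range(k)]


A = [[1, 2], [3, 4]]          # 2 x 2
B = [[0, 1, 2], [1, 0, 3]]    # 2 x 3
print(mat_mul(A, B))          # [[2, 1, 8], [4, 3, 18]]

try:
    mat_mul(B, A)             # 2 x 3 times 2 x 2 is not defined
except ValueError as err:
    print("BA is not defined:", err)
```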
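Example If $A = \begin{bmatrix} 2 & 3 & 2 \\ -1 & 2 & 1 \end{bmatrix} \in M_{2 \times 3}(\mathbb{Q})$ and $B = \begin{bmatrix} -1 & 0 & 2 & -1 \\ 1 & 0 & 1 & -2 \\ 2 & 1 & 0 & -3 \end{bmatrix} \in
M_{3 \times 4}(\mathbb{Q})$ then $AB = \begin{bmatrix} 5 & 2 & 7 & -14 \\ 5 & 1 & 0 & -6 \end{bmatrix} \in M_{2 \times 4}(\mathbb{Q})$ but BA is not defined.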


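Example If we consider the matrices

$$A = \begin{bmatrix} 2 & 3 & 2 \\ -1 & 2 & 1 \end{bmatrix} \in M_{2 \times 3}(\mathbb{Q}) \quad\text{and}\quad B = \begin{bmatrix} -1 & 0 \\ 1 & 0 \\ 2 & 1 \end{bmatrix} \in M_{3 \times 2}(\mathbb{Q})$$

then $AB = \begin{bmatrix} 5 & 2 \\ 5 & 1 \end{bmatrix} \in M_{2 \times 2}(\mathbb{Q})$ and $BA = \begin{bmatrix} -2 & -3 & -2 \\ 2 & 3 & 2 \\ 3 & 8 & 5 \end{bmatrix} \in M_{3 \times 3}(\mathbb{Q})$.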
Suppose that $A = [a_{ij}] \in M_{k \times n}(F)$ and $v = \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix} \in F^n$. Then $Av = \begin{bmatrix} c_1 \\ \vdots \\ c_k \end{bmatrix} \in F^k$
where, for each $1 \le i \le k$, we have $c_i = \sum_{j=1}^{n} a_{ij}b_j$. Denoting the columns of A
by $u_1, \ldots, u_n$, we see that $Av = \sum_{j=1}^{n} b_ju_j$. So we conclude that if there exists
a nonzero vector v such that $Av = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$, then the columns of A must be linearly
dependent. If every element of $F^k$ is of the form Av for some $v \in F^n$, then the
columns of A must form a generating set for $F^k$.
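
The two descriptions of Av, entry by entry and as a linear combination of the columns of A, can be compared directly. This is a small sketch with illustrative names and sample data, not taken from the text.

```python
def mat_vec(A, v):
    """Entry i of Av is the sum over j of a_ij * b_j."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def column_combination(A, v):
    """Compute Av as b_1*u_1 + ... + b_n*u_n, where u_j is column j of A."""
    k, n = len(A), len(A[0])
    result = [0] * k
    for j in range(n):
        for i in range(k):
            result[i] += v[j] * A[i][j]
    return result


A = [[1, 2, 3],
     [2, 4, 6]]
v = [1, 1, -1]                      # a nonzero vector
print(mat_vec(A, v))                # [0, 0]
print(column_combination(A, v))     # [0, 0]
```

Here Av is the zero vector although v is nonzero, which exhibits the dependence u_1 + u_2 - u_3 = 0 among the columns of A.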
Let $(K, \bullet)$ be an associative unital algebra over a field F and let n be a positive
integer. If $v = \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix}$ and $w = \begin{bmatrix} w_1 \\ \vdots \\ w_n \end{bmatrix}$ are elements of $K^n$ then
$v^Tw = \bigl[\sum_{i=1}^{n} v_i \bullet w_i\bigr] \in M_{1 \times 1}(K)$. This is called the interior product of v and w. This
1 × 1 matrix is usually identified with the scalar $\sum_{i=1}^{n} v_i \bullet w_i \in K$, which we will
denote by $v \odot w$, in a departure from usual notation.¹

Dually, the exterior product of v and w is defined to be the matrix $vw^T = [y_{ij}] \in
M_{n \times n}(K)$, where $y_{ij} = v_i \bullet w_j$. We will denote the exterior product of v and w
by $v \wedge w$. Notice that the exterior product is not commutative, but rather $v \wedge w =
(w \wedge v)^T$. Exterior products of vectors are encountered far less often than interior
products, but have important applications in many areas, among them physics (in
the Dirac model of quantum physics, interior products are called bra-ket products,
whereas exterior products are called ket-bra products).
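
For K = F both products are easy to compute. The following is a minimal sketch, with vectors as Python lists; the function names `interior`, `exterior`, and `transpose` are illustrative.

```python
def interior(v, w):
    """Interior product: the single entry of the 1 x 1 matrix v^T w."""
    return sum(a * b for a, b in zip(v, w))


def exterior(v, w):
    """Exterior product: the n x n matrix v w^T with entries v_i * w_j."""
    return [[a * b for b in w] for a in v]


def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


v, w = [1, 2, 3], [4, 5, 6]
print(interior(v, w))    # 32
print(exterior(v, w))    # [[4, 5, 6], [8, 10, 12], [12, 15, 18]]
print(exterior(v, w) == transpose(exterior(w, v)))   # True
```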


¹The usual notation is $v \cdot w$, but that can cause confusion with the dot product, which we will study
later, in the case that F = C. For that reason, also, we use the term “interior product” rather than 
the often-seen “inner product”. 
In particular, we note the following: let K be an algebra over a field F, let
$A \in M_{k \times n}(K)$, and let $B \in M_{n \times t}(K)$. Let $v_1, \ldots, v_k$ be the rows of A and let
$w_1, \ldots, w_t$ be the columns of B. Then $AB = [c_{ij}]$, where $c_{ij} = v_i \odot w_j$ for all
$1 \le i \le k$ and all $1 \le j \le t$.

Let F be a field, let n be a positive integer, let $v = \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix}$ and $w = \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix}$
belong to $F^n$, and let $C = [c_{ij}] \in M_{n \times n}(F)$. Then the computation of
$v \odot Cw = \sum_{i=1}^{n} a_i \bigl(\sum_{j=1}^{n} c_{ij}b_j\bigr)$ requires $n^2 + n$ multiplications and $n^2 - 1$
additions. However, if we can find vectors $u = \begin{bmatrix} u_1 \\ \vdots \\ u_n \end{bmatrix}$ and $y = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}$ in $F^n$ such
that $C = u \wedge y$, then, by the distributive law,

$$v \odot Cw = \sum_{i=1}^{n} a_i \Bigl(\sum_{j=1}^{n} u_iy_jb_j\Bigr) = \Bigl(\sum_{i=1}^{n} a_iu_i\Bigr)\Bigl(\sum_{j=1}^{n} y_jb_j\Bigr)$$

and this requires only $2n + 1$ multiplications and $2n - 2$
additions. Similarly, if we can find vectors $u, u', y, y' \in F^n$ such that $C = u \wedge y +
u' \wedge y'$, then the computation of $v \odot Cw$ requires $4n + 2$ multiplications and $4n - 3$
additions (four interior products, two products of scalars, and one final sum). For
large values of n, this can result in considerable saving, especially if
the computation is to be repeated frequently.
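
The saving described above can be seen on a small case. This is a sketch for K = F with illustrative names and data; `slow` follows the double sum, while `fast` assumes that the matrix C is already known to be the exterior product of u and y.

```python
def interior(v, w):
    """Interior product of two vectors of the same length."""
    return sum(a * b for a, b in zip(v, w))


def exterior(u, y):
    """Exterior product: the matrix with entries u_i * y_j."""
    return [[a * b for b in y] for a in u]


def slow(v, C, w):
    """Interior product of v with Cw, using about n^2 multiplications."""
    Cw = [interior(row, w) for row in C]
    return interior(v, Cw)


def fast(v, u, y, w):
    """Same value when C is the exterior product of u and y."""
    return interior(v, u) * interior(y, w)


v, w = [1, 2, 3], [1, 2, 3]
u, y = [2, 1, 0], [1, 1, 1]
C = exterior(u, y)
print(slow(v, C, w))      # 24
print(fast(v, u, y, w))   # 24
```

The factored form never builds C at all, which also saves the storage for its n^2 entries.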


Example Combinatorial optimization is the area of mathematics dealing with the
computational issues arising from finding optimal solutions to such problems as the
traveling salesman problem, testing Hamiltonian graphs, sphere packing, etc. The
general form of combinatorial optimization problems is the following: Let F be a
subfield of $\mathbb{R}$ and let n be a positive integer. Assume that we have a nonempty finite
(and in general very large) subset S of $\mathbb{N}^n \subseteq F^n$. Usually, the set S arises from the
characteristic functions of certain subsets of $\{1, \ldots, n\}$ of interest in the problem.
Then, given a vector $v = \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix} \in F^n$, we want to find $\min\{s \odot v \mid s \in S\}$. Note that
if we consider F not as a subset of $\mathbb{R}$ but as a subset of the optimization algebra
$\mathbb{R}_\infty$, then the problem becomes one of computing $p(a_1, \ldots, a_n)$, where

$$p(X_1, \ldots, X_n) = \sum_{[i_1, \ldots, i_n] \in S} X_1^{i_1} \cdots X_n^{i_n}$$

is a polynomial in several indeterminates over $\mathbb{R}_\infty$ (polynomials with coefficients in
a semifield are defined in the same way as polynomials with coefficients in a field).


Observe that multiplying a k × n matrix by an n × t matrix requires $ktn$ multiplications
and $kt(n - 1)$ additions, that is, $kt(2n - 1)$ arithmetic operations. If these numbers are
all very large, as is often the case in real-life applications of matrix theory, the
computational overhead, and the risk of accumulated errors due to rounding and
truncation, is substantial.² We will keep this
in mind throughout our discussion, and try to consider strategies of minimizing this
risk. In this connection, we should note that the product of two matrices has an
important property: let $(K, \bullet)$ be an associative unital algebra over a field F and assume
that $A = [v_{ij}] \in M_{k \times n}(K)$ and $B = [w_{ij}] \in M_{n \times t}(K)$, where k, n, and t are positive
integers. Furthermore, let us pick positive integers

$$1 = k(1) < k(2) < \cdots < k(p + 1) = k,$$
$$1 = n(1) < n(2) < \cdots < n(q + 1) = n,$$
$$1 = t(1) < t(2) < \cdots < t(r + 1) = t.$$

For all $1 \le i \le p$ and all $1 \le j \le q$, let

$$A_{ij} = \begin{bmatrix} v_{k(i),n(j)} & \cdots & v_{k(i),n(j+1)} \\ \vdots & & \vdots \\ v_{k(i+1),n(j)} & \cdots & v_{k(i+1),n(j+1)} \end{bmatrix}.$$

This allows us to write A in block form $\begin{bmatrix} A_{11} & \cdots & A_{1q} \\ \vdots & & \vdots \\ A_{p1} & \cdots & A_{pq} \end{bmatrix}$. Note that these blocks
are not necessarily square matrices. In the same way, we can write B as a matrix
$\begin{bmatrix} B_{11} & \cdots & B_{1r} \\ \vdots & & \vdots \\ B_{q1} & \cdots & B_{qr} \end{bmatrix}$. Then $AB = \begin{bmatrix} C_{11} & \cdots & C_{1r} \\ \vdots & & \vdots \\ C_{p1} & \cdots & C_{pr} \end{bmatrix}$ where, for each $1 \le i \le p$ and
each $1 \le h \le r$, we have $C_{ih} = \sum_{j=1}^{q} A_{ij}B_{jh}$. A sophisticated use of this method
can substantially decrease the number of operations needed to multiply two matrices,
as we shall see. Moreover, skilled partitioning of matrices can allow us to make
efficient use of aspects of modern computer architecture, such as cache memories,
to further increase the speed of computation.
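
Block multiplication can be checked against the ordinary product. The sketch below is for K = F and makes one assumption that differs from the indexing above: the partition is given by 0-based cut points describing half-open, non-overlapping index ranges, so the cuts [0, 1, 3] split three rows into rows 0 and rows 1 to 2. All names are illustrative.

```python
def mat_mul(A, B):
    """Ordinary matrix product of lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def mat_add(A, B):
    """Entrywise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def block(M, r0, r1, c0, c1):
    """Submatrix of M with rows r0..r1-1 and columns c0..c1-1."""
    return [row[c0:c1] for row in M[r0:r1]]


def block_mul(A, B, kcuts, ncuts, tcuts):
    """Compute AB block by block: C_ih is the sum over j of A_ij * B_jh."""
    C = [[0] * tcuts[-1] for _ in range(kcuts[-1])]
    for i in range(len(kcuts) - 1):
        for h in range(len(tcuts) - 1):
            acc = None
            for j in range(len(ncuts) - 1):
                Aij = block(A, kcuts[i], kcuts[i + 1], ncuts[j], ncuts[j + 1])
                Bjh = block(B, ncuts[j], ncuts[j + 1], tcuts[h], tcuts[h + 1])
                P = mat_mul(Aij, Bjh)
                acc = P if acc is None else mat_add(acc, P)
            # Copy the finished block C_ih into its place in C.
            for r, row in enumerate(acc):
                C[kcuts[i] + r][tcuts[h]:tcuts[h + 1]] = row
    return C


A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
B = [[1, 0], [0, 1], [1, 1], [2, -1]]
print(block_mul(A, B, [0, 1, 3], [0, 2, 4], [0, 1, 2]) == mat_mul(A, B))  # True
```

The blocks need not be square; only the column cuts of A and the row cuts of B have to agree.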

Needless to say, this seemingly odd definition of the product of two matrices
was not chosen at random. Indeed, it satisfies certain important properties. Thus,
if $(K, \bullet)$ is an associative unital algebra over a field F, if k, n, t, and p are positive
integers, and if we have matrices $A \in M_{k \times n}(K)$, $B, B_1, B_2 \in M_{n \times t}(K)$, and
$C \in M_{t \times p}(K)$, then
(1) $A(BC) = (AB)C$;

(2) $A(B_1 + B_2) = AB_1 + AB_2$;

(3) $(B_1 + B_2)C = B_1C + B_2C$.

As a consequence, we see that if $B \in M_{n \times t}(K)$ is given, then the function from
$M_{k \times n}(K)$ to $M_{k \times t}(K)$ defined by $A \mapsto AB$ is a linear transformation of vector
spaces.


²We will often mention large matrices, without being too specific as to what that means. As a rule
of thumb, a matrix is “large”, and calls for special treatment as such, when it cannot be stored in 
the RAM memory of whatever computer we are using for our computations. Such matrices occur 
in sufficiently-many applications that considerable research is devoted to dealing with them. 
We also note that if $A = [v_{ij}] \in M_{k \times n}(K)$ and $B = [w_{jh}] \in M_{n \times t}(K)$, then
$A^T \in M_{n \times k}(K)$ and $B^T \in M_{t \times n}(K)$ so $B^TA^T \in M_{t \times k}(K)$. Indeed, $B^TA^T =
[y_{hi}]$, where $y_{hi} = \sum_{j=1}^{n} w_{jh} \bullet v_{ij}$. Hence, if K is also commutative (and in
particular if K = F), we have $B^TA^T = (AB)^T$.
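
In the commutative case the identity is easy to confirm on an example. The following is a small sketch over the integers, with illustrative names and sample matrices.

```python
def mat_mul(A, B):
    """Ordinary matrix product of lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


A = [[2, 3, 2], [-1, 2, 1]]        # 2 x 3
B = [[-1, 0], [1, 0], [2, 1]]      # 3 x 2
print(transpose(mat_mul(A, B)))              # [[5, 5], [2, 1]]
print(mat_mul(transpose(B), transpose(A)))   # [[5, 5], [2, 1]]
```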


Matrix multiplication was first defined by 
the nineteenth-century French mathematician 
Jacques Philippe Binet. It took some getting 
used to; many decades later, the father of as- 
trophysics, Sir Arthur Eddington, still wrote 
“I cannot believe that anything so ugly as mul- 
tiplication of matrices is an essential part of the 
scheme of nature”. 


The definition of matrix multiplication is in fact a direct consequence of the rela- 
tion between matrices and linear transformations, which we have already observed. 
This is best seen in the following result. 


Proposition 8.2 Let V be a vector space of finite dimension n over a field
F for which we have chosen a basis $B = \{v_1, \ldots, v_n\}$, let W be a vector
space of finite dimension k over F for which we have chosen a basis
$D = \{w_1, \ldots, w_k\}$, and let Y be a vector space of finite dimension t over F,
for which we have chosen a basis $E = \{y_1, \ldots, y_t\}$. If $\alpha \in \mathrm{Hom}(V, W)$ and
$\beta \in \mathrm{Hom}(W, Y)$ then $\Phi_{BE}(\beta\alpha) = \Phi_{DE}(\beta)\Phi_{BD}(\alpha)$.


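Proof Assume that $\Phi_{BD}(\alpha) = [a_{ij}]$ and $\Phi_{DE}(\beta) = [b_{hi}]$. Then, for every
$v = \sum_{j=1}^{n} c_jv_j \in V$, we have

$$\alpha : v \mapsto \sum_{i=1}^{k} \sum_{j=1}^{n} c_ja_{ij}w_i \quad\text{and}\quad \beta\alpha : v \mapsto \sum_{h=1}^{t} \sum_{i=1}^{k} \sum_{j=1}^{n} c_jb_{hi}a_{ij}y_h,$$

showing the desired equality.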
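
The proposition can be illustrated with canonical bases, where column j of the matrix of a map is the image of the j-th canonical basis vector. The maps `alpha` and `beta` below are hypothetical examples chosen for this sketch, and the helper names are illustrative.

```python
def matrix_of(f, n):
    """Matrix of a linear map f on F^n with respect to canonical bases."""
    cols = [f([1 if i == j else 0 for i in range(n)]) for j in range(n)]
    return [list(row) for row in zip(*cols)]   # columns become matrix columns


def mat_mul(A, B):
    """Ordinary matrix product of lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def alpha(v):
    """A sample linear map from R^3 to R^2."""
    a, b, c = v
    return [a + 2 * b, b - c]


def beta(w):
    """A sample linear map from R^2 to R^3."""
    x, y = w
    return [x, x + y, 2 * y]


def beta_alpha(v):
    """The composite map: first alpha, then beta."""
    return beta(alpha(v))


print(matrix_of(beta_alpha, 3))
# [[1, 2, 0], [1, 3, -1], [0, 2, -2]]
print(mat_mul(matrix_of(beta, 2), matrix_of(alpha, 3)))
# [[1, 2, 0], [1, 3, -1], [0, 2, -2]]
```

Note the order of the factors: the matrix of the map applied first stands on the right.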


We can extend the definition of matrix multiplication as follows: let h, k, and
n be positive integers, and let V be a vector space over a field F. If $A = [a_{ij}] \in
M_{h \times k}(F)$ and if $M = [v_{jt}] \in M_{k \times n}(V)$, we can define $AM \in M_{h \times n}(V)$ to be the
matrix $[u_{it}]$, where $u_{it} = \sum_{j=1}^{k} a_{ij}v_{jt}$ for all $1 \le i \le h$ and $1 \le t \le n$. Notice that
if $A, B \in M_{k \times k}(F)$ and if $M, N \in M_{k \times n}(V)$ then
(1) $A(BM) = (AB)M$;

(2) $A(M + N) = AM + AN$;
(3) $(A + B)M = AM + BM$.

In general, and especially when we are talking of actual computations, it is easier 
to work with matrices than with linear transformations, and indeed most of the mod- 
ern computer software and hardware are designed to facilitate easy and speedy ma- 
trix computation. Therefore, given finitely-generated vector spaces V and W over a 
field F, it is usual to fix bases for them and then identify Hom(V, W) with the space
of all matrices over F of the appropriate size. The choice of the correct bases then 
becomes critical, and we will focus on that throughout the following discussions. 
Such a choice usually depends on the problem at hand. In particular, the automatic 
choice of canonical bases, when they exist, may not be the best for a given prob- 
lem, and can entail a considerable cost both in computational time and numerical 
accuracy. 


Exercises 


Exercise 404 

Let V be the vector space over $\mathbb{R}$ composed of all polynomials in $\mathbb{R}[X]$ having
degree less than 3 and let W be the vector space over $\mathbb{R}$ composed of all
polynomials in $\mathbb{R}[X]$ having degree less than 4. Let $\alpha : V \to W$ be the linear
transformation defined by

$$\alpha : a + bX + cX^2 \mapsto (a + b) + (b + c)X + (a + c)X^2 + (a + b + c)X^3.$$

Select bases $B = \{1, X + 1, X^2 + X + 1\}$ for V and
$D = \{X^3 - X^2, X^2 - X, X - 1, 1\}$
for W. Find the matrix $\Phi_{BD}(\alpha)$.


Exercise 405 


Let K = M2x2(R) and let A = : i € K. Let D be the canonical basis of K. 


If a, p € End(K) are defined by a: X + XA and B: Xb AX, find Ppp(a) 
and ®pp(B). 


Exercise 406 


0 1 2 3 
Given the matrix A=]|1 3 4 0 | €e M3,4(R), find the set of all matrices 
3 2 0 1 
0 
1 
0 


Exercise 407 


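Given the matrix $A = \begin{bmatrix} 1 & 8 \\ 3 & 5 \\ 2 & 2 \end{bmatrix} \in M_{3 \times 2}(\mathbb{Q})$, find the set of all matrices
$B \in M_{2 \times 3}(\mathbb{Q})$ satisfying $AB = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$ and find the set of all matrices
$C \in M_{2 \times 3}(\mathbb{Q})$ satisfying $CA = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$.


Exercise 408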


1 1 2 
; ; 0 1 2 . 
Given the matrix A = L] 3 1 e M4x3(R), find the set of all matrices 
2 1 —1 


1 0 0 
B € M3 x4(R) satisfying BA=|0 1 0 
0 0 1 


Exercise 409 


a+b+c 
Find the matrix representing the linear transformation a : = | a a4 | 


a 
b 
c 

from R? to R? with respect to the bases EH of R? and 


(ikale 


Exercise 410 
Find the set of all matrices A € M4x3(R) satisfying the condition 


0 
0 
0 
0 


O O Om 
ooroeo 
oe aa = =) 


Exercise 411 
Let $\alpha$ be an endomorphism of $\mathbb{R}^3$ represented with respect to some basis by the
matrix $\begin{bmatrix} 0 & 2 & -1 \\ -2 & 5 & -2 \\ -4 & 8 & -3 \end{bmatrix}$. Is $\alpha$ a projection?


Exercise 412 
Find the real numbers missing from the following equation: 


| es | 
x | 
— 
— * 
* N 
[ase x 
es | 
| 

x | 
— 
* * 
x Ne eme 
Cox *¥ WwW 


Exercise 413


a 3a+2b 
Leta:| b | + | —a—c | be an endomorphism of R?. Find the matrix repre- 
c a+ 3b 
1 0 
senting œ with respect to the basis B = -l1 f, O},] 1 of R3. 
0 —1 0 


Exercise 414 
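Let $D = \{v_1, v_2, v_3\}$ be a basis for $\mathbb{R}^3$ and let $\alpha$ be the endomorphism of $\mathbb{R}^3$
satisfying $\Phi_{DD}(\alpha) = \begin{bmatrix} -1 & -1 & -3 \\ -5 & -2 & -6 \\ 2 & 1 & 3 \end{bmatrix}$. Find $\ker(\alpha)$.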


Exercise 415 
Let V be the subspace of $\mathbb{R}[X]$ consisting of all polynomials of degree less
than 3 and choose the basis $B = \{1, X, X^2\}$ for V. Let $\alpha \in \mathrm{End}(V)$ satisfy
$\Phi_{BB}(\alpha) = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 2 & 2 \\ 0 & 0 & 3 \end{bmatrix}$. Let D be the basis $\{1, X + 1, 2X^2 + 4X + 3\}$ for V.
What is $\Phi_{DD}(\alpha)$?


Exercise 416 

Let œ € End(R?) be represented with respect to the canonical basis by the matrix 
2 2 0 
1 1 2 |. Find areal number a such that œ is represented with respect to the 
1 1 2 


0 0 0 
basis -1l],] aj,jl by the matrix | 0 1 0 
0 0 4 


Exercise 417 
Let V be the subspace of $\mathbb{R}[X]$ consisting of all polynomials of degree less than 3
and let $\alpha \in \mathrm{End}(V)$ be defined by
$$\alpha : aX^2 + bX + c \mapsto (a + 2b + c)X^2 + (3a - b)X + (b + 2c).$$
Find $\Phi_{DD}(\alpha)$, where $D = \{X^2 + X + 1, X^2 + X, X^2\}$.


Exercise 418 
Find all rational numbers a for which there exists a nonzero matrix B € 


0 0 0 
a 1 1 000 
Max3(Q) satisfying B| 1 1 aļ|= 
lal 0 0 0 
0 0 0 
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Exercise 419 
For which real numbers a does there exist a real number b (depending on a) 


satisfying ki l | i i = E iP 
1 1 a ~10 Ip 
1 b 


Exercise 420 

Let $V = \mathbb{R}^{\mathbb{R}}$ and let $W$ be the subspace of $V$ generated by the linearly
independent set $B = \{1, x, e^x, xe^x\}$. Let $\delta$ be the endomorphism of $W$ which
assigns to each function its derivative. Find $\Phi_{BB}(\delta)$.


Exercise 421 
Let $B = \{1 + i, 2 + i\}$, which is a basis for $\mathbb{C}$ as a vector space over $\mathbb{R}$. Let $\alpha$ be
the endomorphism of this space defined by $\alpha : z \mapsto \bar{z}$. Find $\Phi_{BB}(\alpha)$.


Exercise 422 
Let F = GF(3) and let a : F? > F? be the linear transformation defined by 
a 
a:| b|| | sa Let B : F? — F* be the linear transformation defined by 
c 
b 


B: B = J bh Find the matrix representing Ba with respect to the canonical 
2a 
bases. 


Exercise 423 
Let $\alpha \in \mathrm{End}(\mathbb{R}^4)$ be represented with respect to the canonical basis by the matrix
\[ \begin{pmatrix} 3 & -1 & 0 & 0 \\ -1 & 2 & -1 & 0 \\ 0 & -1 & 2 & -1 \\ 0 & 0 & -1 & 1 \end{pmatrix}. \]
Given a vector $v \in \mathbb{R}^4$ satisfying the condition that all
entries of $\alpha(v)$ are nonnegative, show that all entries of $v$ are nonnegative.


Exercise 424 

Let $V$ and $W$ be vector spaces over a field $F$ and choose bases $\{v_i \mid i \in \Omega\}$
and $\{w_j \mid j \in \Lambda\}$ for $V$ and $W$, respectively. Let $p : \Omega \times \Lambda \to F$ be a function
satisfying the condition that the set $\{j \in \Lambda \mid p(i, j) \ne 0\}$ is finite for each
$i \in \Omega$. Let $\alpha_p : V \to W$ be the function defined as follows: if $v = \sum_{i \in \Gamma} a_i v_i$,
where $\Gamma$ is a finite subset of $\Omega$ and where the $a_i$ are scalars in $F$, then
$\alpha_p(v) = \sum_{i \in \Gamma} \sum_{j \in \Lambda} a_i \, p(i, j) \, w_j$. Show that $\alpha_p$ is a linear transformation and
that every linear transformation from $V$ to $W$ is of this form.


Exercise 425 
Let $k$ and $n$ be positive integers, let $v \in \mathbb{R}^n$, and let $A \in \mathcal{M}_{k \times n}(\mathbb{R})$. Show that
$Av = O$ if and only if $A^T A v = O$.
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Exercise 426 
Let A € M3 x2(R) and B € M2x3(R) be matrices satisfying 


8 2 -2 
AB = 2 5 4 
—2 4 5 


Calculate BA. 


Exercise 427 
Find matrices A € M3x2(R) and B € M2x3(R) satisfying 


1 1 1 
AB=|-2 0 —6 
0 1 -2 


Exercise 428 

Let $F$ be a field and let $k \ne n$ be positive integers. Let $A, B \in \mathcal{M}_{k \times n}(F)$
and let $\alpha : \mathcal{M}_{n \times k}(F) \to \mathcal{M}_{k \times n}(F)$ be the linear transformation defined by
$\alpha : C \mapsto ACB$. Under which conditions is $\alpha$ an isomorphism?


Exercise 429 

Let F be a field and let n be a positive integer. Let W be a nontrivial subspace 
of the vector space V = Mpnxn(F) satisfying the condition that if A € W and 
B € V then AB and BA both belong to W. Show that W = V. 


Exercise 430 

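Let $a, b, c, a', b', c' \in \mathbb{C}$ satisfy the condition that $aa' + bb' + cc' = 2$, and let
\[ A = I - \begin{pmatrix} a \\ b \\ c \end{pmatrix} \begin{pmatrix} a' & b' & c' \end{pmatrix}. \]
Calculate $A^2$.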


Exercise 431 
Find a nonzero matrix $A$ in $\mathcal{M}_{2 \times 2}(\mathbb{R})$ satisfying $v \odot Av = 0$ for all $v \in \mathbb{R}^2$.


Exercise 432 
Let œ be the endomorphism of Q* represented with respect to the canonical basis 


1 0 1 -I 
by the matrix 0 1 i . Find a two-dimensional subspace of Q4 which 
3 13 4 


is invariant under a. 
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Exercise 433 

Let $n$ be a positive integer and let $F$ be a field. For each $v, w \in F^n$, consider
the function $T_{v,w} : F^n \to Fv$ defined by $T_{v,w} : y \mapsto (w \odot y)v$ (this function is
called the dyadic product function). Show that $T_{v,w}$ is a linear transformation. Is
the function $F^n \to \mathrm{Hom}(F^n, Fv)$ defined by $w \mapsto T_{v,w}$ a linear transformation?


Exercise 434 

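Let $k < n$ be positive integers and let $F$ be a field. Given a matrix $A \in \mathcal{M}_{k \times n}(F)$,
do there necessarily exist matrices $B, C \in \mathcal{M}_{n \times k}(F)$ satisfying the condition that
$AB = O \in \mathcal{M}_{k \times k}(F)$ and $CA = O \in \mathcal{M}_{n \times n}(F)$?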


Exercise 435 

Let A € Mnxn(Q) be a matrix satisfying the condition that if v € Q” is a vector 
all of the components of which are nonnegative, then all of the components of 
Av are nonnegative. Are all of the entries in A necessarily nonnegative? 


Exercise 436 


Find the set of all matrices A in M3x3(R) satisfying A? = 


ooo 
oor 
ooo 


Exercise 437 
Find the set of all real numbers a such that the endomorphism of R? represented 


l aa 
by the matrix | 2 2a 4 | with respect to the canonical basis is an automor- 
3 a 6 


phism. 


Exercise 438 
Find the set of all real numbers $a$ and $b$ such that the endomorphism of $\mathbb{R}^3$ represented by the matrix
\[ \begin{pmatrix} 1 & a & b \\ 0 & a & 1 \\ 0 & a & 1 \end{pmatrix} \]
with respect to the canonical basis is a projection.


Exercise 439 
Let $A = [a_{ij}] \in \mathcal{M}_{n \times n}(\mathbb{R})$ satisfy the condition that for each $v \in \mathbb{R}^n$ there exists
a vector $y \in \mathbb{R}^n$, all entries in which are nonnegative, satisfying $Av = v + y$. Show
that $A = I$.


Exercise 440 
Let F = GF(2) and let K be the subset of M3x3(F) consisting of O, I, and the 
following matrices: 
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0 0 0 1 O 1 1 1 1 1 
10 0J, |1 10j, |O O 17, |O 1 TY, 
O 1 1 O 1 0 1 1 1 1 1 0 
0 1 0 1 1 0 

1 O 1|, and |1 1 1 

1 0 0 1 0 1 


Show that K, together with matrix addition and multiplication, is a field. What 
is its characteristic? Does there exist an element A of K such that every nonzero 


element of K is a power of A? 


The Algebra of Square Matrices 


We are now going to concentrate on the algebraic structure of sets of the form
$\mathcal{M}_{n \times n}(K)$, where $n$ is a positive integer and $(K, \bullet)$ is an associative unital algebra
over a field $F$. From what we have already seen, this is again an associative
unital $F$-algebra, which will not be commutative if $n > 1$. The additive identity of
this algebra is the matrix all of the entries of which equal $0_K$. The additive inverse
of a matrix $A = [a_{ij}] \in \mathcal{M}_{n \times n}(K)$ is the matrix $[-a_{ij}]$. The multiplicative identity
of $\mathcal{M}_{n \times n}(K)$ is the matrix $E = [d_{ij}]$ given by
\[ d_{ij} = \begin{cases} e & \text{if } i = j, \\ 0_K & \text{otherwise,} \end{cases} \]
where $e$ is the multiplicative identity of $(K, \bullet)$.
The most important case is, of course, that of $K = F$. In this case, the additive
identity is $O$ and the multiplicative identity is the matrix $I = [a_{ij}]$ defined by
\[ a_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{otherwise.} \end{cases} \]


If $K$ is a vector space of dimension $n$ over $F$ and if $B$ is a basis of $K$, then it is
straightforward to verify that the function $\Phi_{BB} : \mathrm{End}(K) \to \mathcal{M}_{n \times n}(F)$ is an
isomorphism of unital $F$-algebras.

If $F$ is a field and if $n$ is a positive integer then, corresponding to the associative
$F$-algebra $\mathcal{M}_{n \times n}(F)$, we have the Lie algebra $\mathcal{M}_{n \times n}(F)^{-}$. This Lie algebra is
called the general Lie algebra defined by $F^n$.


Example Let $F$ be a field and let $A = [a_{ij}] \in \mathcal{M}_{4 \times 4}(F)$. Then $A$ can also
be written in block form as
\[ \left[\begin{array}{cc|cc} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ \hline a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{array}\right] \in \mathcal{M}_{2 \times 2}(K), \]
where $K = \mathcal{M}_{2 \times 2}(F)$. Addition and multiplication of matrices are so defined (and not
accidentally!) that they give the same results whether performed in $\mathcal{M}_{4 \times 4}(F)$ or in
$\mathcal{M}_{2 \times 2}(K)$.


Example The set $K$ of all analytic functions from $\mathbb{C}$ to itself is clearly an algebra
over $\mathbb{C}$. At the beginning of the twentieth century, G. D. Birkhoff made use of
matrices in $\mathcal{M}_{n \times n}(K)$ to study the properties of analytic functions.



George D. Birkhoff was one of the leading American mathematicians 
at the beginning of the twentieth century, who worked in many areas 
of analysis. 


We begin by identifying some particularly important square matrices over a
unital associative $F$-algebra $K$, and with them some significant subalgebras of
$\mathcal{M}_{n \times n}(K)$.

Let $(K, \bullet)$ be an associative unital $F$-algebra and let $n$ be a positive integer. A matrix
$A = [a_{ij}] \in \mathcal{M}_{n \times n}(K)$ is a diagonal matrix if and only if there exist elements
$c_1, \ldots, c_n$ of $K$ such that
\[ a_{ij} = \begin{cases} c_i & \text{if } i = j, \\ 0_K & \text{otherwise.} \end{cases} \]


The matrices $O$ and $E$ are diagonal. Moreover, the sum and product of diagonal
matrices are diagonal matrices, and so the set of all diagonal matrices is an
$F$-subalgebra of $\mathcal{M}_{n \times n}(K)$. If $K$ is commutative (and, in particular, if $K = F$)
then this algebra is also commutative. The units of the subalgebra are all diagonal
matrices in which each $c_i$ is a unit of $K$ (and hence surely nonzero). In this
case,
\[ \begin{pmatrix} c_1 & \cdots & 0_K \\ \vdots & \ddots & \vdots \\ 0_K & \cdots & c_n \end{pmatrix}^{-1} = \begin{pmatrix} c_1^{-1} & \cdots & 0_K \\ \vdots & \ddots & \vdots \\ 0_K & \cdots & c_n^{-1} \end{pmatrix}. \]


Example Let $F$ be a field, let $(K, \bullet)$ be an associative unital $F$-algebra, and let $n$ be
a positive integer. A matrix $A = [a_{ij}] \in \mathcal{M}_{n \times n}(K)$ is a scalar matrix if and only if
there exists a scalar $c \in K$ such that $a_{ij} = c$ when $i = j$ and $a_{ij} = 0_K$ otherwise. We
denote this matrix by $cE$ (and, in particular, $cI$ when $K = F$). Scalar matrices are
surely diagonal matrices, and both $O$ and $E$ are scalar matrices. Moreover, the sum
and product of scalar matrices are scalar matrices. If $c, d \in K$ then $(cE)(dE) =
(dE)(cE)$ and if $0_K \ne c \in K$ is a unit, then $(cE)(c^{-1}E) = E$. Hence the set of
all scalar matrices over $F$ forms an $F$-subalgebra of $\mathcal{M}_{n \times n}(F)$, which is in fact a
field. The function $F \to \mathcal{M}_{n \times n}(F)$ defined by $c \mapsto cI$ is a monic homomorphism
of $F$-algebras, and so we can identify $F$ with the subfield of all scalar matrices
of $\mathcal{M}_{n \times n}(F)$. Moreover, it is also easy to see that $(cI)A = A(cI) = cA$ for any
$A \in \mathcal{M}_{n \times n}(F)$.


Let $(K, \bullet)$ be an associative unital $F$-algebra, let $n$ be a positive integer, and let $d$
be a positive integer less than $n$. A matrix $A = [a_{ij}] \in \mathcal{M}_{n \times n}(K)$ is a band matrix
of width $2d - 1$ if and only if $a_{ij} = 0_K$ whenever $|i - j| > d - 1$. Thus, the band
matrices of width 1 are the diagonal matrices. The matrix


€ Msx5(R) 


CooonNnre 
ooownN 
coooroeo 
ROCCO 
Am. ooo 


is an example of a band matrix of width 3. The set of band matrices of fixed width
is closed under addition and contains $O$ and $E$, but is not necessarily closed under
multiplication, and so is not a subalgebra of $\mathcal{M}_{n \times n}(K)$. However, it is closed under
scalar multiplication and so is a subspace of the vector space $\mathcal{M}_{n \times n}(K)$ over $F$.

Band matrices over a field are very important for numerical computations, especially
when $d$ is small relative to $n$. Of particular importance are band matrices of
width 3, which are also known as tridiagonal matrices, and have important use in
the computation of quadratic splines and in the computation of extremal eigenvalues
of matrices; they also appear very often in methods of solution of differential
equations. Tridiagonal matrices have the added advantage of being easily stored in
a computer, since all we need to do is keep the three diagonals in which nonzero
entries can occur. For example, a tridiagonal matrix in $\mathcal{M}_{1000 \times 1000}(\mathbb{R})$ has 1,000,000
entries, of which at most $3 \cdot 1000 - 2 = 2998$ are nonzero.
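
The storage scheme just described can be sketched in Python. This is only an illustration under stated assumptions: the class name `Tridiagonal` and its methods are hypothetical names chosen here (they do not come from any library), and the sample matrix with 2 on the diagonal and -1 beside it is just a convenient test case.

```python
class Tridiagonal:
    """Hypothetical helper: an n x n tridiagonal matrix kept as three diagonals."""

    def __init__(self, lower, main, upper):
        n = len(main)
        if len(lower) != n - 1 or len(upper) != n - 1:
            raise ValueError("off-diagonals must have length n - 1")
        self.n = n
        self.lower = list(lower)  # entries (i+1, i)
        self.main = list(main)    # entries (i, i)
        self.upper = list(upper)  # entries (i, i+1)

    def stored_entries(self):
        # n diagonal entries plus two off-diagonals of length n - 1
        return 3 * self.n - 2

    def matvec(self, x):
        """Compute the product of the matrix with the vector x in O(n) operations."""
        y = [self.main[i] * x[i] for i in range(self.n)]
        for i in range(self.n - 1):
            y[i] += self.upper[i] * x[i + 1]
            y[i + 1] += self.lower[i] * x[i]
        return y


T = Tridiagonal([-1] * 999, [2] * 1000, [-1] * 999)
count = T.stored_entries()
y = T.matvec([1] * 1000)
```

Only `3n - 2` numbers are kept instead of `n * n`, and a matrix-vector product touches each stored entry once.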

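A special type of tridiagonal matrix in $\mathcal{M}_{2n \times 2n}(F)$, which we will see again later,
is one of the form
\[ \begin{pmatrix} A_{11} & O & \cdots & O \\ O & A_{22} & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & A_{nn} \end{pmatrix}, \]
where the $A_{ii}$ are $2 \times 2$ blocks. Note
that this matrix can also be thought of as a diagonal matrix in $\mathcal{M}_{n \times n}(K)$, where
$K = \mathcal{M}_{2 \times 2}(F)$. More generally, if $d$ and $n$ are positive integers, then any diagonal
matrix in $\mathcal{M}_{n \times n}(L)$, where $L = \mathcal{M}_{d \times d}(K)$, is a band matrix of width $2d - 1$ in
$\mathcal{M}_{dn \times dn}(K)$.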

Let $(K, \bullet)$ be an associative unital $F$-algebra and let $n$ be a positive integer. A matrix
$A = [c_{ij}] \in \mathcal{M}_{n \times n}(K)$ is an upper-triangular matrix if and only if $c_{ij} = 0_K$
whenever $i > j$. Thus, the matrix
\[ \begin{pmatrix} 1 & 2 & 6 & 3 & 7 \\ 0 & 3 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 4 \end{pmatrix} \in \mathcal{M}_{5 \times 5}(\mathbb{R}) \]
is upper triangular. The set of all upper-triangular matrices includes the set of diagonal matrices,
is closed under addition, and contains $O$ and $E$. Moreover, it is closed under
multiplication, and so is an $F$-subalgebra of $\mathcal{M}_{n \times n}(K)$. In the case that $K = F$, we
see that the dimension of this subalgebra as a vector space over $F$ equals $\frac{1}{2}n(n + 1)$.
Upper-triangular matrices arise naturally in many applications, as we will see below.
In a similar manner, we say that a matrix $A = [c_{ij}] \in \mathcal{M}_{n \times n}(K)$ is a lower-triangular
matrix if and only if $c_{ij} = 0_K$ whenever $i < j$. Again, the set of all
lower-triangular matrices is a subspace of the vector space $\mathcal{M}_{n \times n}(K)$ over $F$ and,
indeed, an $F$-subalgebra. Note that a matrix $A$ is upper triangular if and only if $A^T$
is lower triangular.

A matrix $A = [c_{ij}] \in \mathcal{M}_{n \times n}(K)$ is symmetric if and only if $A = A^T$. That is,
$A$ is symmetric if and only if $c_{ij} = c_{ji}$ for all $1 \le i, j \le n$. If $B$ is any matrix in
$\mathcal{M}_{n \times n}(K)$ then $B + B^T$ is symmetric. If $K$ is commutative and if $C \in \mathcal{M}_{k \times n}(K)$
for any positive integers $k$ and $n$, then $CC^T \in \mathcal{M}_{k \times k}(K)$ and $C^TC \in \mathcal{M}_{n \times n}(K)$
are symmetric. If $n$ is a positive integer and $F$ is a field, then $v \wedge v$ is a symmetric
matrix in $\mathcal{M}_{n \times n}(F)$ for all $v \in F^n$. Diagonal matrices are clearly symmetric and
the set of symmetric matrices in $\mathcal{M}_{n \times n}(K)$ is closed under taking sums and scalar
multiples, and so it is a subspace of the vector space $\mathcal{M}_{n \times n}(K)$ over $F$. In the case
$K = F$, the dimension of this subspace equals $\frac{1}{2}n(n + 1)$. However, the set of symmetric
matrices is not closed under products. For example, the matrices
\[ A = \begin{pmatrix} 2 & 5 & 1 \\ 5 & 2 & 0 \\ 1 & 0 & 1 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 1 & 2 & 1 \\ 2 & 0 & 0 \\ 1 & 0 & 3 \end{pmatrix} \]
in $\mathcal{M}_{3 \times 3}(\mathbb{R})$ are symmetric, but
\[ AB = \begin{pmatrix} 13 & 4 & 5 \\ 9 & 10 & 5 \\ 2 & 2 & 4 \end{pmatrix} \]
is not. In fact, in Chap. 13 we will show that if $n > 1$ then every matrix in $\mathcal{M}_{n \times n}(\mathbb{C})$
is a product of two symmetric matrices.

We note, however, that if $A$ and $B$ are a commuting pair of symmetric matrices
then $(AB)^T = (BA)^T = A^T B^T = AB$, so $AB$ is again symmetric.
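
A quick numerical check of both remarks, written as a minimal Python sketch. The helper names `matmul`, `transpose`, and `is_symmetric` are illustrative choices made here; the commuting pair used below (a symmetric matrix and its own square) is an assumption picked only because such a pair always commutes.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def is_symmetric(X):
    return X == transpose(X)

A = [[2, 5, 1], [5, 2, 0], [1, 0, 1]]
B = [[1, 2, 1], [2, 0, 0], [1, 0, 3]]
AB = matmul(A, B)           # symmetric factors, non-symmetric product

A2 = matmul(A, A)           # A and A2 are symmetric and commute
A3 = matmul(A, A2)          # so their product is symmetric again
```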

A matrix $A = [c_{ij}] \in \mathcal{M}_{n \times n}(K)$ is skew symmetric if and only if $A = -A^T$.
The set of all skew-symmetric matrices in $\mathcal{M}_{n \times n}(K)$ is again a subspace of
$\mathcal{M}_{n \times n}(K)$. Note that if $F$ is a field having characteristic other than 2, then any
matrix $A \in \mathcal{M}_{n \times n}(K)$ can be written as the sum of a symmetric matrix and a
skew-symmetric matrix, since $A = \frac{1}{2}(A + A^T) + \frac{1}{2}(A - A^T)$. In one of the examples after
Proposition 5.14, we saw that this representation is in fact unique. The Lie product
of two skew-symmetric matrices is again skew-symmetric.


Example Let $n$ be a positive integer. A matrix $A = [a_{ij}] \in \mathcal{M}_{n \times n}(\mathbb{R})$ is a Markov
matrix if and only if $a_{ij} \ge 0$ for all $1 \le i, j \le n$ and $\sum_{j=1}^{n} a_{hj} = 1$ for each
$1 \le h \le n$; it is a stochastic matrix if and only if both $A$ and $A^T$ are Markov matrices. It
is easy to show that the product of two Markov matrices is again a Markov matrix
and the product of two stochastic matrices in $\mathcal{M}_{n \times n}(\mathbb{R})$ is again a stochastic matrix.

Markov matrices arise naturally in probability theory. In particular, if we have
a system which, at each tick of a (discrete) clock, is in one of the distinct states
$s_1, \ldots, s_n$ and if, for each $1 \le i, j \le n$, we denote by $p_{ij}$ the probability that if the
system is in state $s_i$ at a given time $t$ then it will be in state $s_j$ at time $t + 1$, the
matrix $[p_{ij}]$ is a Markov matrix.
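
The closure of Markov matrices under products can be checked on a small case. The sketch below assumes the row-sum convention (each row is nonnegative and sums to 1), which matches the transition-probability description above; the matrices `P` and `Q` and the helper names are illustrative choices, and exact rational arithmetic is used so that the row sums can be compared with 1 without rounding.

```python
from fractions import Fraction as Fr

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def is_markov(P):
    """Nonnegative entries and every row summing to 1 (assumed convention)."""
    return (all(x >= 0 for row in P for x in row)
            and all(sum(row) == 1 for row in P))

P = [[Fr(1, 2), Fr(1, 2), 0],
     [Fr(1, 4), Fr(1, 2), Fr(1, 4)],
     [0, Fr(1, 3), Fr(2, 3)]]
Q = [[Fr(1, 3), Fr(1, 3), Fr(1, 3)],
     [0, 1, 0],
     [Fr(1, 2), 0, Fr(1, 2)]]

PQ = matmul(P, Q)  # two-step transition probabilities
```

The reason is visible in the code: row $h$ of `PQ` is a weighted average of the rows of `Q`, with weights taken from row $h$ of `P`.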


Russian mathematician Andrei Andreyevich Markov made major 
contributions to probability theory at the beginning of the twentieth 
century. 


As we have already pointed out, a matrix $O \ne A \in \mathcal{M}_{n \times n}(K)$ is not necessarily a
unit. The units of $\mathcal{M}_{n \times n}(K)$ are known as nonsingular matrices; the other matrices
are singular matrices. By what we have already noted, the product of nonsingular
matrices is again nonsingular and if $A$ is nonsingular then surely so is $A^{-1}$. A matrix
$A$ satisfying $A^2 = I$ is certainly nonsingular. Such matrices are called involutory
matrices.




These terms were first used by American mathematician Maxime 
Bocher in 1907. He was also the first to popularize the terms “linearly 
dependent” and “linearly independent”. 


Example If $a, b \in \mathbb{R}$ with $b \ne 0$ then
\[ \begin{pmatrix} a & b \\ (1 - a^2)b^{-1} & -a \end{pmatrix} \]
is involutory.


Example We have already noted that if $n$ is a positive integer then a diagonal
matrix $A = [a_{ij}] \in \mathcal{M}_{n \times n}(\mathbb{C})$ is nonsingular when all of the diagonal entries
$a_{ii}$ are nonzero. It therefore seems reasonable to conjecture that a matrix will
be nonsingular if the diagonal entries are all “much greater” than the other entries.
Indeed, this is true in the following sense: A sufficient condition for a matrix
$A = [a_{ij}] \in \mathcal{M}_{n \times n}(\mathbb{C})$ to be nonsingular is that for each $1 \le i \le n$ we have
$|a_{ii}| > \sum_{j \ne i} |a_{ij}|$. This result is known as the Diagonal Dominance Theorem.
A proof of this theorem will be given in Chap. 15.
Let $V$ be a vector space over $F$ of dimension $n$ and let $B$ be a basis of $V$. Then
for each matrix $A \in \mathcal{M}_{n \times n}(F)$ there exists an endomorphism $\alpha$ of $V$ such that
$A = \Phi_{BB}(\alpha)$. If $A$ is nonsingular then there also exists an endomorphism $\beta$ of $V$
satisfying $A^{-1} = \Phi_{BB}(\beta)$. This means that $I = AA^{-1} = \Phi_{BB}(\alpha)\Phi_{BB}(\beta) = \Phi_{BB}(\alpha\beta)$
and so $\alpha\beta = \sigma_1$, and similarly $\beta\alpha = \sigma_1$. Therefore, $\alpha \in \mathrm{Aut}(V)$ and $\beta = \alpha^{-1}$.


Example Let $F$ be a field and let $n$ be a positive integer. If $c \in F$ and $v, w \in F^n$,
then the matrix $A = I + c(v \wedge w)$ is nonsingular if and only if the scalar $1 + c(v \odot w)$
is nonzero. Indeed, direct computation shows that if $1 + c(v \odot w) \ne 0$, then
$A^{-1} = I + d(v \wedge w)$, where $d = -c[1 + c(v \odot w)]^{-1}$, and if $1 + c(v \odot w) = 0$ then
$Av = v + c(v \odot w)v$ is the 0-vector, and so $A$ must be singular.
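
Both halves of this example can be checked with exact arithmetic. The Python sketch below is illustrative only: it assumes that $v \wedge w$ denotes the $n \times n$ matrix with $(i, j)$-entry $v_i w_j$ and that $v \odot w$ is the usual dot product, and the particular vectors and helper names are choices made here.

```python
from fractions import Fraction as Fr

def dot(v, w):                      # v (.) w
    return sum(a * b for a, b in zip(v, w))

def outer(v, w):                    # v ^ w, with (i, j)-entry v_i * w_j
    return [[a * b for b in w] for a in v]

def identity(n):
    return [[Fr(int(i == j)) for j in range(n)] for i in range(n)]

def rank_one_shift(c, v, w):
    """Return I + c (v ^ w)."""
    n = len(v)
    I, M = identity(n), outer(v, w)
    return [[I[i][j] + c * M[i][j] for j in range(n)] for i in range(n)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

v = [Fr(1), Fr(2), Fr(0)]
w = [Fr(3), Fr(-1), Fr(4)]          # v (.) w = 1

c = Fr(2)                           # 1 + c (v (.) w) = 3, nonzero
A = rank_one_shift(c, v, w)
d = -c / (1 + c * dot(v, w))
A_inv = rank_one_shift(d, v, w)     # claimed inverse I + d (v ^ w)

c_sing = Fr(-1) / dot(v, w)         # forces 1 + c (v (.) w) = 0
A_sing = rank_one_shift(c_sing, v, w)
image_of_v = [dot(row, v) for row in A_sing]
```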


Example The multiplicative inverse of a “nice” nonsingular matrix may not be
“nice”. Thus, if $A \in \mathcal{M}_{n \times n}(\mathbb{R})$ is a nonsingular matrix all of the entries of which
are nonnegative, it does not follow that all of the entries of $A^{-1}$ are nonnegative.
For example, if we choose
\[ A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{pmatrix} \]
then direct computation shows us that
\[ A^{-1} = \begin{pmatrix} 3 & -1 & -1 \\ -1 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix}. \]
If $A = [a_{ij}]$ is the $n \times n$ tridiagonal matrix with $a_{ii} = 2$
for all $1 \le i \le n$ and $a_{ij} = -1$ whenever $|i - j| = 1$, then not only is $A^{-1}$ not
tridiagonal, but in fact no entries in $A^{-1}$ equal 0, for any $n > 1$.


Example If a matrix $A \in \mathcal{M}_{n \times n}(F)$ can be written in block form $[A_{ij}]$, where each $A_{ii}$
is a nonsingular square matrix and $A_{ij} = O$ for $i \ne j$, then $A$ is nonsingular, and
$A^{-1} = [B_{ij}]$, where $B_{ii} = A_{ii}^{-1}$ for each $i$ and $B_{ij} = O$ for each $i \ne j$. In particular,
if each $A_{ii}$ is involutory, then so is $A$.


Example Let $n$ be a prime positive integer. The complex number $c_n = \cos\left(\frac{2\pi}{n}\right) +
i \sin\left(\frac{2\pi}{n}\right)$ is called a primitive root of unity of degree $n$, since it is easy to check that
$c_n^n = 1$ but $c_n^h \ne 1$ for all $0 < h < n$. Therefore, $c_n^{-1} = c_n^{n-1}$ for all $n$. For each
$z \in \mathbb{C}$, let $F(z) \in \mathcal{M}_{n \times n}(\mathbb{C})$ be the matrix $[a_{ij}]$ defined by $a_{ij} = z^{(i-1)(j-1)}$ for all
$1 \le i, j \le n$. It is straightforward to show that the matrix $F(c_n)$ is nonsingular and,
indeed, $F(c_n)^{-1} = \frac{1}{n} F(c_n^{-1})$. The endomorphism $\varphi_n$ of $\mathbb{C}^n$ which is represented
with respect to the canonical basis by the matrix $F(c_n)$ is called the discrete Fourier
transform of $\mathbb{C}^n$. This endomorphism is of great importance in applied mathematics.
An algorithm, known as the fast Fourier transform (FFT), introduced by J.W. Cooley
and John Tukey in 1965, allows one to calculate $\varphi_n(v)$ in an order of $n \log(n)$
arithmetic operations, rather than $n^2$, as one would anticipate. This facilitates the
use of Fourier transforms in applications. A similar construction is also possible
over finite fields, and especially over fields of the form $GF(p)$. We will look at this
example again in Chap. 15.

A closely-related endomorphism, the discrete cosine transform, is used in defin- 
ing the JPEG algorithm for image compression. 
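
The inverse formula for the Fourier matrix can be tested numerically. The following Python sketch is an illustration under assumptions made here: it uses floating-point complex numbers (so the identity is checked only up to rounding error), the size `n = 5` is an arbitrary choice, and `fourier_matrix` is a hypothetical helper name. It builds the matrix directly in $n^2$ steps and does not implement the FFT.

```python
import cmath

def fourier_matrix(z, n):
    """The matrix F(z): entry (i, j) is z**((i-1)(j-1)), written with 0-based i, j."""
    return [[z ** (i * j) for j in range(n)] for i in range(n)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

n = 5
c = cmath.exp(2j * cmath.pi / n)     # cos(2*pi/n) + i*sin(2*pi/n)

F = fourier_matrix(c, n)
G = [[x / n for x in row] for row in fourier_matrix(1 / c, n)]  # (1/n) F(c^-1)

product = matmul(F, G)
max_error = max(abs(product[i][j] - (1 if i == j else 0))
                for i in range(n) for j in range(n))
```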


Joseph Fourier was a close friend of Napoleon and served for many 
years as permanent secretary of the Parisian Academy of Sciences. He 
worked primarily in applied mathematics, and developed many im- 
portant tools in this area. John Tukey was a twentieth-century Amer- 
ican statistician who developed many advanced mathematical tools in 
statistics. 


Let $(K, \bullet)$ be an algebra over a field $F$. A matrix representation of $K$ by matrices
over $F$ is a homomorphism of $F$-algebras from $K$ to $\mathcal{M}_{n \times n}(F)$ for some positive
integer $n$. Matrix representations are a very important tool in studying the structure
of algebras over fields. More generally, a representation of $K$ over $F$ is a homomorphism
of $F$-algebras from $K$ to $\mathrm{End}(V)$ for some vector space $V$, not necessarily
finitely generated, over $F$.


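Example Recall that $\mathbb{C}$ is an algebra of dimension 2 over $\mathbb{R}$. The function
$\gamma : \mathbb{C} \to \mathcal{M}_{2 \times 2}(\mathbb{R})$ defined by
\[ \gamma : a + bi \mapsto \begin{pmatrix} a & b \\ -b & a \end{pmatrix} \]
is a matrix representation of $\mathbb{C}$ by
matrices over $\mathbb{R}$. In fact, this representation is clearly monic and its image is the
subalgebra $T$ of $\mathcal{M}_{2 \times 2}(\mathbb{R})$ consisting of all matrices of the form
$\begin{pmatrix} a & b \\ -b & a \end{pmatrix}$, so $\gamma$ is
an $\mathbb{R}$-algebra isomorphism from $\mathbb{C}$ to $T$.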
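
That this map respects sums and products can be spot-checked in Python. The sketch assumes the matrix form with $a$ on the diagonal, $b$ in the upper right corner and $-b$ in the lower left corner; the function name `gamma` and the sample numbers are illustrative.

```python
def gamma(z):
    """Send a + bi to the real 2 x 2 matrix [[a, b], [-b, a]]."""
    a, b = z.real, z.imag
    return [[a, b], [-b, a]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matadd(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

z, w = complex(1, 2), complex(3, -1)
product_side = matmul(gamma(z), gamma(w))   # multiply the matrices
image_side = gamma(z * w)                   # multiply in C, then map
```

The sample values are small integers, so the floating-point comparison is exact here; for general inputs one would compare up to a tolerance.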


Let $(K, \bullet)$ be an associative unital algebra over a field $F$ having multiplicative
identity $e$, and let $n$ be a positive integer. Let $E$ be the multiplicative identity of
$\mathcal{M}_{n \times n}(K)$. A matrix $A = [c_{ij}] \in \mathcal{M}_{n \times n}(K)$ is an elementary matrix if and only if it
is of one of the following forms:

(1) $E_{hk}$, the matrix formed from $E$ by interchanging the $h$th and $k$th columns,
where $h \ne k$;

(2) $E_{h;c}$, the matrix formed from $E$ by multiplying the $h$th column by $0_K \ne c \in K$;

(3) $E_{hk;c}$, the matrix formed from $E$ by adding $c$ times the $k$th column to the $h$th
column, where $h \ne k$ and where $c \in K$.

It is easy to verify that matrices of the form $E_{hk}$ and $E_{hk;c}$ are always nonsingular,
with $E_{hk}^{-1} = E_{hk}$ and $E_{hk;c}^{-1} = E_{hk;-c}$. If $c$ is a unit in $K$, then matrices
of the form $E_{h;c}$ are nonsingular, with $E_{h;c}^{-1} = E_{h;c^{-1}}$. Thus, if $K$ is a field (and
in particular, if $K = F$), every elementary matrix of the form $E_{h;c}$ is nonsingular.
We note that the transpose of an elementary matrix is again an elementary matrix.
Indeed, $E_{hk}^T = E_{hk}$ and $E_{h;c}^T = E_{h;c}$ for all $1 \le h, k \le n$ and $0_K \ne c \in K$, while
$E_{hk;c}^T = E_{kh;c}$ for all $1 \le h \ne k \le n$ and all $c \in K$.

As the name clearly implies, there is a connection between the elementary automorphisms
which we defined previously and the elementary matrices. Indeed, if
$K = F$ and if $B$ is the canonical basis of $F^n$, then $E_{hk} = \Phi_{BB}(\varepsilon_{hk})$, $E_{h;c} = \Phi_{BB}(\varepsilon_{h;c})$,
and $E_{hk;c} = \Phi_{BB}(\varepsilon_{hk;c})$.
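Let us see what happens when one multiplies an arbitrary matrix in $\mathcal{M}_{n \times n}(K)$
on the left by an elementary matrix:

(1) If $B \in \mathcal{M}_{n \times n}(K)$ then $E_{hk}B$ is the matrix obtained from $B$ by interchanging
the $h$th and $k$th rows of $B$. Thus, for example, in $\mathcal{M}_{4 \times 4}(\mathbb{Q})$ we look at the effect
of $E_{24}$:
\[ \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix} \begin{pmatrix} 5 & 6 & 4 & 1 \\ 3 & 2 & 2 & 2 \\ 0 & 4 & 2 & 7 \\ 3 & 3 & 2 & 2 \end{pmatrix} = \begin{pmatrix} 5 & 6 & 4 & 1 \\ 3 & 3 & 2 & 2 \\ 0 & 4 & 2 & 7 \\ 3 & 2 & 2 & 2 \end{pmatrix}. \]

(2) If $B \in \mathcal{M}_{n \times n}(K)$ then $E_{h;c}B$ is the matrix obtained from $B$ by multiplying
the $h$th row of $B$ by $c$. Thus, for example, in $\mathcal{M}_{4 \times 4}(\mathbb{Q})$ we look at the effect
of $E_{2;5}$:
\[ \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 5 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 5 & 6 & 4 & 1 \\ 3 & 2 & 2 & 2 \\ 0 & 4 & 2 & 7 \\ 3 & 3 & 2 & 2 \end{pmatrix} = \begin{pmatrix} 5 & 6 & 4 & 1 \\ 15 & 10 & 10 & 10 \\ 0 & 4 & 2 & 7 \\ 3 & 3 & 2 & 2 \end{pmatrix}. \]

(3) If $B \in \mathcal{M}_{n \times n}(K)$ then $E_{hk;c}B$ is the matrix obtained from $B$ by adding $c$ times
the $h$th row to the $k$th row. Thus, for example, in $\mathcal{M}_{4 \times 4}(\mathbb{Q})$ we look at the effect
of $E_{13;2}$:
\[ \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 2 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 5 & 6 & 4 & 1 \\ 3 & 2 & 2 & 2 \\ 0 & 4 & 2 & 7 \\ 3 & 3 & 2 & 2 \end{pmatrix} = \begin{pmatrix} 5 & 6 & 4 & 1 \\ 3 & 2 & 2 & 2 \\ 10 & 16 & 10 & 9 \\ 3 & 3 & 2 & 2 \end{pmatrix}. \]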

Proposition 9.1 If $F$ is a field, if $n$ is a positive integer, and if $A, B, C, D \in
\mathcal{M}_{n \times n}(F)$ then:

(1) When $A$ and $B$ are nonsingular, so is $AB$, with $(AB)^{-1} = B^{-1}A^{-1}$;

(2) When $AB$ is nonsingular, both $A$ and $B$ are nonsingular;

(3) When $A$ and $B$ are nonsingular, $A^{-1} + B^{-1} = A^{-1}(B + A)B^{-1}$;

(4) When $I + AB$ is nonsingular, so is $I + BA$, and $(I + BA)^{-1} = I -
B(I + AB)^{-1}A$;

(5) (Guttman’s Theorem) If $A$ is nonsingular and if $v, w \in F^n$ satisfy
the condition that $1 + w \odot A^{-1}v \ne 0$, then the matrix $A + v \wedge w \in
\mathcal{M}_{n \times n}(F)$ is nonsingular and satisfies
$(A + v \wedge w)^{-1} = A^{-1} - (1 + w \odot A^{-1}v)^{-1}(A^{-1}[v \wedge w]A^{-1})$;

(6) (Sherman-Morrison-Woodbury Theorem) When the matrices $C$, $D$,
$D^{-1} + AC^{-1}B$, and $C + BDA$ are nonsingular, then $(C + BDA)^{-1} =
C^{-1} - C^{-1}B(D^{-1} + AC^{-1}B)^{-1}AC^{-1}$.


Proof (1) This is a special case of a general remark about units in associative
$F$-algebras, which we have already noted.

(2) Let $V$ be a vector space of dimension $n$ over $F$, and let $D$ be a basis of $V$. Then
there exist endomorphisms $\alpha$ and $\beta$ of $V$ satisfying $A = \Phi_{DD}(\alpha)$ and $B = \Phi_{DD}(\beta)$,
and so $AB = \Phi_{DD}(\alpha\beta)$. Since $AB$ is nonsingular, we know that $\alpha\beta \in \mathrm{Aut}(V)$.
Then there exists an automorphism $\gamma$ of $V$ satisfying $\gamma(\alpha\beta) = \sigma_1 = (\alpha\beta)\gamma$. Then
$(\gamma\alpha)\beta = \sigma_1 = \alpha(\beta\gamma)$ and so, by Proposition 7.4, we know that both $\alpha$ and $\beta$ are
automorphisms of $V$, and hence both $A$ and $B$ are nonsingular.
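(3) This is an immediate consequence of the fact that $A(A^{-1} + B^{-1})B = B + A$.

(4) We note that
\begin{align*}
(I + BA)[I - B(I + AB)^{-1}A] &= I + BA - (B + BAB)(I + AB)^{-1}A \\
&= I + BA - B(I + AB)(I + AB)^{-1}A \\
&= I + BA - BA = I.
\end{align*}

(5) A simple calculation shows us that if $x, y \in F^n$ satisfy the condition that
$c = 1 + y \odot x$ is nonzero, then, since $[x \wedge y]^2 = (y \odot x)[x \wedge y] = (c - 1)[x \wedge y]$,
\begin{align*}
(I - c^{-1}[x \wedge y])(I + [x \wedge y]) &= I + x \wedge y - c^{-1}[x \wedge y] - c^{-1}[x \wedge y]^2 \\
&= I + x \wedge y - c^{-1}c[x \wedge y] = I,
\end{align*}
and so $(I + [x \wedge y])^{-1} = I - c^{-1}[x \wedge y]$. Therefore, if we set $d = 1 + w \odot A^{-1}v$
then, noting that $A^{-1}[v \wedge w] = (A^{-1}v) \wedge w$,
\begin{align*}
(A + v \wedge w)^{-1} &= [A(I + A^{-1}[v \wedge w])]^{-1} = (I + A^{-1}[v \wedge w])^{-1}A^{-1} \\
&= [I - d^{-1}(A^{-1}[v \wedge w])]A^{-1} \\
&= A^{-1} - d^{-1}(A^{-1}[v \wedge w]A^{-1}),
\end{align*}
as required.

(6) First, note that $I + C^{-1}BDA = C^{-1}(C + BDA)$ and so, by (1), this matrix is
nonsingular as well. By (4), $(I + C^{-1}BDA)^{-1} = I - C^{-1}B(I + DAC^{-1}B)^{-1}DA$,
and so
\begin{align*}
(C + BDA)^{-1} &= [C(I + C^{-1}BDA)]^{-1} \\
&= [I - C^{-1}B(I + DAC^{-1}B)^{-1}DA]C^{-1} \\
&= C^{-1} - C^{-1}B[D^{-1}(I + DAC^{-1}B)]^{-1}AC^{-1} \\
&= C^{-1} - C^{-1}B(D^{-1} + AC^{-1}B)^{-1}AC^{-1},
\end{align*}
as required.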




Louis Guttman was a twentieth-century American/Israeli statistician 
and sociologist who developed many advanced mathematical tools for 
use in statistics. The Sherman—Morrison—Woodbury Theorem was in 
fact first published by British aeronautics professor W.J. Duncan, but 
is named after the twentieth-century American statisticians Jack Sher- 
man, Winifred J. Morrison, and Max Woodbury who used it exten- 
sively. 
Guttman’s Theorem is important in the following context: assume we have calculated
$A^{-1}$ for some square matrix $A$ and now we have to calculate $B^{-1}$, where
$B$ differs from $A$ in only one entry (so that $B = A + v \wedge w$ for vectors $v$ and $w$ each
having a single nonzero entry). With the help of this result, we can make use of
our knowledge of $A^{-1}$ to calculate $B^{-1}$ with relative ease and speed. The
Sherman-Morrison-Woodbury Theorem has similar uses.
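
The use of Guttman's Theorem described above can be sketched in Python with exact rational arithmetic. This is an illustration, not a tuned numerical routine: the function name `guttman_update` is hypothetical, $v \wedge w$ is assumed to be the matrix with entries $v_i w_j$, and the sample changes the $(1,1)$ entry of a $3 \times 3$ matrix whose inverse is already known.

```python
from fractions import Fraction as Fr

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def guttman_update(A_inv, v, w):
    """Given the inverse of A, return the inverse of A + v ^ w (if it exists)."""
    n = len(v)
    Av = [dot(row, v) for row in A_inv]          # the column vector A^-1 v
    wA = [dot(w, col) for col in zip(*A_inv)]    # the row vector w^T A^-1
    d = 1 + dot(w, Av)
    if d == 0:
        raise ValueError("A + v ^ w is singular")
    # A^-1 - d^-1 (A^-1 [v ^ w] A^-1), computed without a full matrix product
    return [[A_inv[i][j] - Av[i] * wA[j] / d for j in range(n)]
            for i in range(n)]

A_inv = [[Fr(3), Fr(-1), Fr(-1)],
         [Fr(-1), Fr(1), Fr(0)],
         [Fr(-1), Fr(0), Fr(1)]]      # inverse of [[1,1,1],[1,2,1],[1,1,2]]

v = [Fr(1), Fr(0), Fr(0)]             # v ^ w has a single 1, in position (1, 1)
w = [Fr(1), Fr(0), Fr(0)]
B = [[2, 1, 1], [1, 2, 1], [1, 1, 2]]  # A with its (1, 1) entry raised by 1
B_inv = guttman_update(A_inv, v, w)
```

The update costs on the order of $n^2$ operations, compared with the order of $n^3$ needed to invert $B$ from scratch.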

In particular, we note from Proposition 9.1 that if $A, B \in \mathcal{M}_{n \times n}(F)$ then $AB$ is
nonsingular if and only if $BA$ is nonsingular. We should also note that if $A, B \in
\mathcal{M}_{n \times n}(F)$ then $B^T A^T = (AB)^T$, and so if $A$ is a nonsingular matrix and $B = A^{-1}$
then $AB = I$, and so $B^T A^T = I^T = I$. Thus $A^T$ is also nonsingular. Moreover, this
also shows that $(A^T)^{-1} = (A^{-1})^T$ for every nonsingular matrix $A \in \mathcal{M}_{n \times n}(F)$.


Proposition 9.2 Let $F$ be a field, let $n$ be a positive integer, and let $A, B \in
\mathcal{M}_{n \times n}(F)$, where $A$ is nonsingular. Then there exist unique matrices $C$ and $D$
in $\mathcal{M}_{n \times n}(F)$ satisfying $CA = B = AD$.


Proof Define $C = BA^{-1}$ and $D = A^{-1}B$. Then surely $CA = B = AD$. If $C'$ and
$D'$ are matrices satisfying $C'A = B = AD'$ then $C' = (C'A)A^{-1} = BA^{-1} = C$ and
$D' = A^{-1}(AD') = A^{-1}B = D$, and so we have uniqueness.


Example The matrices C and D in Proposition 9.2 need not be the same. For ex- 


ample, if A, B € M2 2(R) are defined by A = ‘ al and B = f i then 


1 -3 1 -3 —8 -3 
-1_ = = 
are |e | ae S| | 


Proposition 9.3 Let $F$ be a field, let $n$ be a positive integer, and let
$A = [a_{ij}] \in \mathcal{M}_{n \times n}(F)$. Then the following conditions are equivalent:

(1) $A$ is nonsingular;

(2) The columns of $A$ are distinct and the set of these columns is a linearly
independent subset of $F^n$;

(3) The rows of $A$ are distinct and the set of these rows is a linearly
independent subset of $\mathcal{M}_{1 \times n}(F)$.


Proof (1) $\Leftrightarrow$ (2): Denote the columns of $A$ by $y_1, \ldots, y_n$. Let $V = F^n$ and let
$B = \{v_1, \ldots, v_n\}$ be the canonical basis of $V$. If two columns of $A$ are equal
or if the set of columns is linearly dependent, there exist scalars $c_1, \ldots, c_n$, not
all equal to 0, such that
\[ A \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix} = \sum_{i=1}^{n} c_i y_i = \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}. \]
But if (1) holds, then
\[ \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix} = A^{-1} \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}, \]
which is a contradiction. Therefore, (2) holds. Conversely,
assume (2) holds. Then the endomorphism $\alpha$ of $V$ given by $v \mapsto Av$ is
monic and so an automorphism of $V$. But $A = \Phi_{BB}(\alpha)$, and so, as we have seen, $A$
is nonsingular.


(1) $\Leftrightarrow$ (3): This follows directly from the equivalence of (1) and (2), given the
fact that a matrix $A$ is nonsingular if and only if $A^T$ is nonsingular.


Example Let $F$ be a field and let $n > 1$ be an integer. If $v, w \in F^n$, then the columns
of $v \wedge w \in \mathcal{M}_{n \times n}(F)$ are linearly dependent and so $v \wedge w$ is always singular.


Example If $F$ is a field and if $U = [u_{ij}] \in \mathcal{M}_{n \times n}(F)$ is an upper-triangular matrix
satisfying the condition that $u_{ii} \ne 0$ for all $1 \le i \le n$ then, by Proposition 9.3,
it is clear that $U$ is nonsingular. We claim that, moreover, $U^{-1}$ is again upper
triangular. Let us prove this contention by induction on $n$. It is clearly true for
$n = 1$. Assume therefore that $n > 1$ and that we have already shown that the inverse
of any such upper-triangular matrix in $\mathcal{M}_{(n-1) \times (n-1)}(F)$ is upper triangular. Write
\[ U = \begin{pmatrix} A & y \\ z^T & u_{nn} \end{pmatrix}, \]
where $A \in \mathcal{M}_{(n-1) \times (n-1)}(F)$, $y \in F^{n-1}$, and $z = \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}$. Assume that
\[ U^{-1} = \begin{pmatrix} B & x \\ w^T & b \end{pmatrix}, \]
where $B \in \mathcal{M}_{(n-1) \times (n-1)}(F)$ and $w, x \in F^{n-1}$. Then
$AB + y \wedge w = I$, $Ax + by = z$, $u_{nn}w^T = z^T$, and $u_{nn}b = 1$, so we must have
$b = u_{nn}^{-1} \ne 0$ and $w^T = z^T$. Therefore, $y \wedge w = O$ and so $B = A^{-1}$. By hypothesis, $B$
is upper triangular and so $U^{-1}$ is again upper triangular. A similar argument holds
for lower-triangular matrices.


Proposition 9.4 Let $F$ be a field and let $n$ be a positive integer. A matrix in
$\mathcal{M}_{n \times n}(F)$ is nonsingular if and only if it is a product of elementary matrices.


Proof Since each of the elementary matrices is nonsingular, we know that any product
of elementary matrices is also nonsingular. Conversely, let $A = [a_{ij}]$ be a nonsingular
matrix in $\mathcal{M}_{n \times n}(F)$ and let $B = [b_{ij}]$ be $A^{-1}$. Then $B$ is also nonsingular
and so, by Proposition 9.3, the columns of $B$ are distinct and the set of columns is
linearly independent in $F^n$. In particular, there exists a nonzero entry $b_{h1}$ in the first
column of $B$. Multiply $B$ on the left by $E_{h1}$ (when $h \ne 1$) to get a new matrix in which the
$(1, 1)$-entry is nonzero. Now multiply it on the left by $E_{1;c}$, where $c = b_{h1}^{-1}$, in order to get
a matrix of the form
\[ \begin{pmatrix} 1 & * & \cdots & * \\ * & * & \cdots & * \\ \vdots & \vdots & & \vdots \\ * & * & \cdots & * \end{pmatrix}. \]
Now let $1 < t \le n$, and let $d(t)$ be the additive
inverse of the $(t, 1)$-entry of the matrix. Multiplying the matrix on the left by
$E_{1t;d(t)}$, we will get a matrix with 0 in the $(t, 1)$-entry and so, after doing this for each such
$t$, we obtain a matrix $B' = C_1B$, where $C_1$ is a product of elementary matrices,
which is of the form
\[ \begin{pmatrix} 1 & * & \cdots & * \\ 0 & * & \cdots & * \\ \vdots & \vdots & & \vdots \\ 0 & * & \cdots & * \end{pmatrix}. \]
This matrix is still nonsingular, since it is
a product of two nonsingular matrices, and so its columns are distinct and form a
linearly independent subset of $F^n$. Therefore, there exists a nonzero entry $b'_{h2}$ in the
second column, with $h > 1$. Repeating the above procedure, we can find a matrix
$C_2$ which is a product of elementary matrices and such that $C_2C_1B$ is of the form
\[ \begin{pmatrix} 1 & 0 & * & \cdots & * \\ 0 & 1 & * & \cdots & * \\ 0 & 0 & * & \cdots & * \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & * & \cdots & * \end{pmatrix}. \]
Continuing in this manner, we obtain matrices $C_1, \ldots, C_n$,
each of them a product of elementary matrices, such that $C_n \cdots C_1 B = I$. Therefore,
$C_n \cdots C_1 = B^{-1} = A$, as we wanted to show.


Example Let $F$ be a field and let $n$ be a positive integer. Every permutation $\pi$ of the
set $\{1, \ldots, n\}$ defines a matrix $A_\pi = [a_{ij}] \in \mathcal{M}_{n \times n}(F)$ by setting $a_{ij} = 1$ if $j = \pi(i)$
and $a_{ij} = 0$ otherwise, called the permutation matrix defined by $\pi$. This matrix is
clearly a result of multiplying $I$ by a number of elementary matrices of the form
$E_{hk}$, and so is nonsingular.
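
A permutation matrix is easy to build directly from its definition. The Python sketch below is illustrative: it assumes the convention just stated (entry $(i, j)$ is 1 exactly when $j = \pi(i)$), represents $\pi$ as a list whose $i$th item is $\pi(i)$ with 1-based values, and uses the hypothetical helper name `permutation_matrix`.

```python
def permutation_matrix(pi):
    """pi[i - 1] holds pi(i); entry (i, j) is 1 exactly when j = pi(i)."""
    n = len(pi)
    return [[1 if j == pi[i] - 1 else 0 for j in range(n)] for i in range(n)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

pi = [2, 3, 1]                       # the cycle 1 -> 2 -> 3 -> 1
A_pi = permutation_matrix(pi)
column = [[10], [20], [30]]
permuted = matmul(A_pi, column)      # entry i of the result is entry pi(i) of the input
check = matmul(A_pi, transpose(A_pi))
```

In this example the transpose of `A_pi` is its inverse, which gives a direct confirmation that the matrix is nonsingular.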


The order of multiplication given in Proposition 9.4 is not unique. Indeed, we
claim that it is possible to write any nonsingular matrix $A \in \mathcal{M}_{n \times n}(F)$ in the form
$PC$, where $P$ is a permutation matrix and $C$ is a product of elementary matrices of
the form $E_{i;c}$ and $E_{ij;c}$. To see how this is done, we note that if $1 \le i, h, k \le n$ and
if $c \in F$ then $E_{i;c}E_{hk} = E_{hk}E_{i;c}$ if $i \notin \{h, k\}$ and $E_{h;c}E_{hk} = E_{hk}E_{k;c}$, and a similar
result holds for elementary matrices of the form $E_{ij;c}$ and $E_{hk}$. Thus, one by one,
the elementary matrices of the form $E_{hk}$ can be “moved to the left” until we obtain the
desired decomposition.

Proposition 9.4 allows us to construct an algorithm for computing $A^{-1}$ when
$A$ is a nonsingular matrix in $\mathcal{M}_{n \times n}(F)$. First of all, we construct the matrix
$[I \;\; A] \in \mathcal{M}_{n \times 2n}(F)$ and on this matrix we perform a series of elementary operations,
namely operations which are the result of multiplying it on the left by elementary
matrices, which bring the right-hand block into the form $I$. Then the left-hand
block is $A^{-1}$. To calculate $A^{-1}$ by this method, we use $n^3 - 2n^2 + n$ additions and
$n^3$ multiplications.
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12 3 
Example Consider the matrix A= | 2 3 0 | €e M3x3(Q). Therefore, we begin 
0 1 2 
3 
0 
2 


1001 2 
with the matrix | 0 1 0 2 3 € M3x6(Q). Then we 
0010 1 
100 1 2 3 
(1) Get | —2 1 0 O —1 —6 | after multiplying the first row by —2 and 
0 0 1 0 1 2 


adding it to the second row; 


1 0 0 1 2 3 
(2) Get} —2 1 0 0 —1 -—6 | after adding the second row to the third row; 
—2 110 0 -4 
1 0 0 1 2 3 
(3) Get | 2 -1 0 O 1 6 after multiplying the second row by 
0.5 —0.25 -0.25 0 0 1 
—1 and then multiplying the third row by —0.25; 
1 0 0 1 2 3 
(4) Get} —1 0.5 1.5 0 1 O | after multiply the third row by —6 and 
0.5 —0.25 -0.25 0 0 1 
adding it to the second row; 
—0.5 0.75 0.75 1 2 0 
(5) Get | —1 0.5 15 0 1 | after multiplying the third row by 
0.5 —0.25 —0.25 0 0 1 
—3 and adding it to the first row; 
1.5 —0.25 -2.25 1 0 
(6) Finally, get | —1 0.5 15 0 1 
0.5 —0.25 —0.25 0 0 
row by —2 and adding it to the first row. 


0 
O | after multiplying the second 
1 


6 -1 -9 
Therefore, we see that A~! = ; —4 2 6 
2 -1 -1 
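
The procedure on $[I \ A]$ can be sketched as follows. This is a minimal illustration, assuming exact rational arithmetic via Python's `fractions` module; the function name `invert_via_augmented` is hypothetical. The order in which rows are cleared differs slightly from the hand computation above, but every step is still a left multiplication by an elementary matrix.

```python
from fractions import Fraction


def invert_via_augmented(a):
    """Invert a square matrix by row-reducing [I | A] until the right block is I."""
    n = len(a)
    m = [[Fraction(int(i == j)) for j in range(n)] + [Fraction(x) for x in a[i]]
         for i in range(n)]
    for h in range(n):
        # Find a row at or below h with a nonzero entry in column h of the right block.
        pivot = next((r for r in range(h, n) if m[r][n + h] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        m[h], m[pivot] = m[pivot], m[h]
        scale = m[h][n + h]
        m[h] = [x / scale for x in m[h]]
        # Clear the rest of the column, above and below the pivot.
        for r in range(n):
            if r != h and m[r][n + h] != 0:
                factor = m[r][n + h]
                m[r] = [x - factor * y for x, y in zip(m[r], m[h])]
    return [row[:n] for row in m]


inverse = invert_via_augmented([[1, 2, 3], [2, 3, 0], [0, 1, 2]])
```

For the matrix of the example, `inverse` has rows (3/2, -1/4, -9/4), (-1, 1/2, 3/2), (1/2, -1/4, -1/4), which is the matrix found above.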


Example When one uses a computer to compute matrix inverses, one must always
be aware of hardware limitations. For example, one can show that Nievergelt's
matrix $A = \begin{bmatrix} 888445 & 887112 \\ 887112 & 885781 \end{bmatrix} \in M_{2\times 2}(\mathbb{Q})$ is nonsingular, while the matrix
$B = A - \begin{bmatrix} c & -c \\ -c & c \end{bmatrix}$, where $c = \frac{1}{3548450}$ (which is approximately $2.818 \times 10^{-7}$), is
not. Nonetheless, a computer or calculator capable of only 12-digit accuracy cannot
differentiate between the two.
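
The claim can be checked with a short experiment. The sketch below assumes that the perturbation subtracted from $A$ has the sign pattern $(c, -c; -c, c)$, which is the pattern that makes the determinant of $B$ vanish for $c = 1/3548450$; the 12-digit machine is simulated with a `decimal` context, and the names `det2` and `perturb` are hypothetical.

```python
from decimal import Context, Decimal
from fractions import Fraction

A = [[888445, 887112], [887112, 885781]]
C_EXACT = Fraction(1, 3548450)


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def perturb(a, c, sub, add):
    """Return a - [[c, -c], [-c, c]] using the supplied arithmetic operations."""
    return [[sub(a[0][0], c), add(a[0][1], c)],
            [add(a[1][0], c), sub(a[1][1], c)]]


# Exact rational arithmetic: det(A) is 1 and det(B) is 0.
B_exact = perturb(A, C_EXACT, lambda x, y: x - y, lambda x, y: x + y)

# Simulated 12-significant-digit arithmetic: B rounds back to A entry by entry.
ctx = Context(prec=12)
A_12 = [[Decimal(x) for x in row] for row in A]
c_12 = ctx.divide(Decimal(1), Decimal(3548450))
B_12 = perturb(A_12, c_12, ctx.subtract, ctx.add)
```

In exact arithmetic `det2(A)` is 1 and `det2(B_exact)` is 0, while at 12 significant digits `B_12` and `A_12` are equal.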


Example For each positive integer $n$, let $H_n \in M_{n\times n}(\mathbb{Q})$ be the matrix $[a_{ij}]$ in
which $a_{ij} = \frac{1}{i+j-1}$. This matrix is called the $n \times n$ Hilbert matrix. Hilbert matrices
are all nonsingular but, while their entries all lie between 0 and 1, the entries in their
inverses are very large. For example, $H_6^{-1}$ equals
\[
\begin{bmatrix}
36 & -630 & 3360 & -7560 & 7560 & -2772 \\
-630 & 14700 & -88200 & 211680 & -220500 & 83160 \\
3360 & -88200 & 564480 & -1411200 & 1512000 & -582120 \\
-7560 & 211680 & -1411200 & 3628800 & -3969000 & 1552320 \\
7560 & -220500 & 1512000 & -3969000 & 4410000 & -1746360 \\
-2772 & 83160 & -582120 & 1552320 & -1746360 & 698544
\end{bmatrix}.
\]
Therefore, these matrices are often used as benchmarks to judge the efficiency and
accuracy of computer programs to calculate matrix inverses. In particular, if the computer
we are using has only 7-digit accuracy, it is reasonable to assume that we will
have a 100% error in computing $H_6^{-1}$.
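
The displayed inverse can be checked exactly with rational arithmetic. The following sketch assumes only the definition $a_{ij} = 1/(i+j-1)$; the helper names `hilbert` and `matmul` are hypothetical.

```python
from fractions import Fraction


def hilbert(n):
    """The n x n Hilbert matrix with entries 1 / (i + j - 1), for 1-based i and j."""
    return [[Fraction(1, i + j - 1) for j in range(1, n + 1)] for i in range(1, n + 1)]


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


H6_INVERSE = [
    [36, -630, 3360, -7560, 7560, -2772],
    [-630, 14700, -88200, 211680, -220500, 83160],
    [3360, -88200, 564480, -1411200, 1512000, -582120],
    [-7560, 211680, -1411200, 3628800, -3969000, 1552320],
    [7560, -220500, 1512000, -3969000, 4410000, -1746360],
    [-2772, 83160, -582120, 1552320, -1746360, 698544],
]

product = matmul(hilbert(6), H6_INVERSE)
identity = [[int(i == j) for j in range(6)] for i in range(6)]
```

Here `product` equals `identity` exactly. Repeating the computation with floating-point entries of limited precision is a simple way to observe the loss of accuracy described above.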


With kind permission of the Archives of the Mathematisches Forschungsinstitut Oberwolfach.

The German mathematician David Hilbert was one of the foremost mathematicians in the
world at the beginning of the twentieth century. He and his students
were among the first to study infinite-dimensional vector spaces.


It is sometimes possible to use a representation of a nonsingular matrix $A$ in block
form in order to calculate $A^{-1}$. Indeed, suppose that $A \in M_{n\times n}(F)$ is a matrix
which can be written in block form $\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$, where $A_{11} \in M_{k\times k}(F)$. If $A_{11}$
and $C = A_{22} - A_{21}A_{11}^{-1}A_{12}$ are both nonsingular, then $A$ is also nonsingular, with
\[
A^{-1} = \begin{bmatrix} I & -A_{11}^{-1}A_{12} \\ O & I \end{bmatrix}
\begin{bmatrix} A_{11}^{-1} & O \\ O & C^{-1} \end{bmatrix}
\begin{bmatrix} I & O \\ -A_{21}A_{11}^{-1} & I \end{bmatrix}.
\]
Similarly, if $A_{22}$ and $D = A_{11} - A_{12}A_{22}^{-1}A_{21}$ are both nonsingular, then $A$ is also
nonsingular, with
\[
A^{-1} = \begin{bmatrix} I & O \\ -A_{22}^{-1}A_{21} & I \end{bmatrix}
\begin{bmatrix} D^{-1} & O \\ O & A_{22}^{-1} \end{bmatrix}
\begin{bmatrix} I & -A_{12}A_{22}^{-1} \\ O & I \end{bmatrix}.
\]
The matrices $C$ and $D$ are, respectively, the Schur complements of $A_{11}$ and $A_{22}$
in $A$. These conditions, however, are sufficient but not necessary for $A$ to be
nonsingular, as the following example shows.

With kind permission of the Archives of the Mathematisches Forschungsinstitut Oberwolfach.


Issai Schur was a twentieth-century German mathematician who is 
known primarily for his work in group theory. 


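Example The matrix $A = \left[\begin{array}{cc|cc} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ \hline 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{array}\right] \in M_{4\times 4}(\mathbb{Q})$ is nonsingular,
despite the fact that all of the given $2 \times 2$ blocks are singular.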


It is important to make clear, however, that it is hardly ever necessary, in applications,
to actually compute the inverse of a nonsingular matrix. One is more likely
to have to compute a product of the form $A^{-1}B$, which can usually be done without
explicitly computing $A^{-1}$ first.

Let $F$ be a field and let $k$ and $n$ be positive integers. Two matrices
$B, C \in M_{k\times n}(F)$ are equivalent if and only if there exist nonsingular matrices
$P \in M_{k\times k}(F)$ and $Q \in M_{n\times n}(F)$ such that $PBQ = C$. This is, indeed, an
equivalence relation on $M_{k\times n}(F)$ since:

(1) $IBI = B$ for each such matrix $B$, showing that $B$ is equivalent to itself;

(2) If $PBQ = C$ then $P^{-1}CQ^{-1} = B$;

(3) If $PBQ = C$ and $P'CQ' = D$ then $(P'P)B(QQ') = D$, where we note that
both $P'P$ and $QQ'$ are again nonsingular.

Similarly, we say that $B$ and $C$ are row equivalent if and only if there exists a
nonsingular matrix $P \in M_{k\times k}(F)$ satisfying $PB = C$, and we say that $B$ and $C$ are
column equivalent if and only if there exists a nonsingular matrix $Q \in M_{n\times n}(F)$
satisfying $BQ = C$. Both of these relations are also equivalence relations on $M_{k\times n}(F)$,
and it is clear that if $B$ and $C$ are row equivalent then they are equivalent (take
$Q = I$) and if they are column equivalent then they are equivalent (take $P = I$).

Equivalence of matrices is a very strong concept. Indeed, it is easy to show that
any matrix $B \in M_{k\times n}(F)$ is equivalent to one which is in block form $\begin{bmatrix} I & O \\ O & O \end{bmatrix}$.
Therefore, it is more useful to consider row equivalence of matrices as our basic
tool.

Now let $V$ be a vector space of dimension $n$ over a field $F$ and choose bases $B =
\{v_1, \ldots, v_n\}$ and $D = \{w_1, \ldots, w_n\}$ of $V$. For each $1 \le j \le n$ there exist elements
$q_{1j}, \ldots, q_{nj}$ of $F$ satisfying $w_j = \sum_{i=1}^n q_{ij}v_i$. By Proposition 9.3, we know that the
matrix $Q = [q_{ij}]$ is nonsingular. If $v = \sum_{i=1}^n a_i v_i = \sum_{j=1}^n b_j w_j$ is an element of $V$,
then we see that
\[
v = \sum_{j=1}^n b_j w_j = \sum_{j=1}^n b_j \Bigl(\sum_{i=1}^n q_{ij} v_i\Bigr) = \sum_{i=1}^n \Bigl(\sum_{j=1}^n q_{ij} b_j\Bigr) v_i
\]
and so we must have $a_i = \sum_{j=1}^n q_{ij} b_j$ for all $1 \le i \le n$. Thus we see that
\[
\begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix} = Q \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix}.
\]
The matrix $Q$ is called the change-of-basis matrix from $D$ to $B$.


Example Let $F$ be a field, let $n$ be a positive integer, and let $V$ be the subspace of
the vector space $F[X]$ made up of all polynomials of degree at most $n - 1$. Then
$\dim(V) = n$, and it has a canonical basis $B = \{1, X, \ldots, X^{n-1}\}$. Let $c_1, \ldots, c_n$ be
distinct scalars, and for each $1 \le i \le n$, consider the polynomial
\[
p_i(X) = \prod_{j \ne i} (c_i - c_j)^{-1}(X - c_j) \in V.
\]
This polynomial is called the $i$th Lagrange interpolation polynomial, and we will
return to these polynomials below in another context. It is clear that
\[
p_i(c_j) = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{otherwise.} \end{cases}
\]
Thus, for example, if $n = 4$ and if we choose $c_1 = 1$, $c_2 = 3$, $c_3 = 5$, and $c_4 = 7$, we
obtain
\begin{align*}
p_1(X) &= -\tfrac{1}{48}X^3 + \tfrac{5}{16}X^2 - \tfrac{71}{48}X + \tfrac{35}{16}, \\
p_2(X) &= \tfrac{1}{16}X^3 - \tfrac{13}{16}X^2 + \tfrac{47}{16}X - \tfrac{35}{16}, \\
p_3(X) &= -\tfrac{1}{16}X^3 + \tfrac{11}{16}X^2 - \tfrac{31}{16}X + \tfrac{21}{16}, \\
p_4(X) &= \tfrac{1}{48}X^3 - \tfrac{3}{16}X^2 + \tfrac{23}{48}X - \tfrac{5}{16}.
\end{align*}
Returning to the general case, we see that the set $D = \{p_1(X), \ldots, p_n(X)\}$ of
Lagrange interpolation polynomials is linearly independent since, if we have
$\sum_{i=1}^n a_i p_i(X) = 0$, then for each $1 \le h \le n$ we have $a_h = \sum_{i=1}^n a_i p_i(c_h) = 0$.
Therefore, $D$ is also a basis of $V$. If $q(X)$ is an arbitrary polynomial in $V$ then
there exist scalars $a_1, \ldots, a_n$ satisfying $q(X) = \sum_{i=1}^n a_i p_i(X)$. Again, this implies
that $a_i = q(c_i)$ for all $i$. In particular, if $q(X) = X^k$ we see that $X^k = \sum_{i=1}^n c_i^k p_i(X)$.
Therefore, the change of basis matrix from $D$ to $B$ is
\[
\begin{bmatrix} 1 & c_1 & \cdots & c_1^{n-1} \\ 1 & c_2 & \cdots & c_2^{n-1} \\ \vdots & \vdots & & \vdots \\ 1 & c_n & \cdots & c_n^{n-1} \end{bmatrix}.
\]
A matrix of this form is called a Vandermonde matrix, and such matrices are always
nonsingular.
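
The Lagrange interpolation polynomials for the nodes 1, 3, 5, 7 can be recomputed mechanically. The sketch below is an illustration only: polynomials are stored as coefficient lists with the constant term first, and the helper names `poly_mul`, `lagrange_basis`, and `evaluate` are hypothetical.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, constant term first."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def lagrange_basis(nodes):
    """Return p_1, ..., p_n with p_i equal to the product over j != i of (X - c_j) / (c_i - c_j)."""
    basis = []
    for i, ci in enumerate(nodes):
        p = [Fraction(1)]
        for j, cj in enumerate(nodes):
            if j != i:
                scale = Fraction(1, ci - cj)
                p = poly_mul(p, [-cj * scale, scale])
        basis.append(p)
    return basis


def evaluate(p, x):
    """Evaluate a coefficient list at the point x."""
    return sum(coeff * x**k for k, coeff in enumerate(p))


nodes = [1, 3, 5, 7]
basis = lagrange_basis(nodes)

# Coefficients of sum_i c_i^k p_i(X); by the text this should be the monomial X^k.
recombined = [
    [sum(c**k * p[d] for c, p in zip(nodes, basis)) for d in range(4)]
    for k in range(4)
]
```

With these nodes, `basis[0]` is the coefficient list of $p_1(X)$, namely 35/16, -71/48, 5/16, -1/48, and row `k` of `recombined` is the coefficient list of $X^k$.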


With kind permission of ETH-Bibliothek Zurich, Image 
Archive (Lagrange). 

Joseph-Louis Lagrange was one of the applied 
mathematicians who surrounded Napoleon, and 
his book on analytical mechanics is considered 
a mathematical classic. Alexandre-Théophile 
Vandermonde was an eighteenth century French 
chemist and mathematician who studied determi- 
nants of matrices. Vandermonde matrices do not appear in his work, and it is not clear why 
they are named after him. 


Lagrange interpolation allows us to represent a polynomial $p(X)$ of degree less
than $n$ in a computer not by its list of coefficients but rather by a list of its values
$p(a_1), \ldots, p(a_n)$ at $n$ preselected elements of $F$. Such representations can be used
to obtain algorithms for rapid multiplication of polynomials, especially in the case
the field $F$ is finite (having $n$ elements, of course). Indeed, if $p(X)$ and $q(X)$ are
polynomials in $F[X]$ of positive degree satisfying $\deg(p) + \deg(q) = h < n$, then
$p(X)q(X)$ is the unique polynomial $t(X)$ of degree $h$ satisfying $t(a_i) = p(a_i)q(a_i)$
for all $1 \le i \le h + 1$.


Let us now return to the matter of change of basis, and now let us assume
that we have a linear transformation $\alpha : V \to Y$, where $V$ is a vector
space of dimension $n$ over a field $F$ and $Y$ is a vector space of dimension
$k$ over $F$. We have bases $B = \{v_1, \ldots, v_n\}$ and $D = \{w_1, \ldots, w_n\}$ of $V$.
Choose a basis $E = \{y_1, \ldots, y_k\}$ of $Y$. Then $\Phi_{BE}(\alpha)$ is a matrix $C = [c_{ij}]$. If
$Q = [q_{ij}]$ is the change of basis matrix from $D$ to $B$ then for each $1 \le j \le n$
we have
\[
\alpha(w_j) = \alpha\Bigl(\sum_{h=1}^n q_{hj}v_h\Bigr) = \sum_{h=1}^n q_{hj}\alpha(v_h) = \sum_{h=1}^n q_{hj}\Bigl(\sum_{i=1}^k c_{ih}y_i\Bigr) = \sum_{i=1}^k \Bigl(\sum_{h=1}^n c_{ih}q_{hj}\Bigr) y_i,
\]
and so $\Phi_{DE}(\alpha) = CQ$, showing that $\Phi_{DE}(\alpha)$ and $C$ are
column equivalent. In the same manner, if we have another basis $G = \{z_1, \ldots, z_k\}$ of
$Y$ and if $P = [p_{ij}]$ is the change of basis matrix from $E$ to $G$, then $z_j = \sum_{i=1}^k p_{ij}y_i$
for all $1 \le j \le k$. If $\Phi_{BG}(\alpha)$ is the matrix $C' = [c'_{ij}]$, then for all $1 \le j \le n$ we
have
\[
\alpha(v_j) = \sum_{h=1}^k c'_{hj}z_h = \sum_{h=1}^k c'_{hj}\Bigl(\sum_{i=1}^k p_{ih}y_i\Bigr) = \sum_{i=1}^k \Bigl(\sum_{h=1}^k p_{ih}c'_{hj}\Bigr) y_i
\]
and this equals $\sum_{i=1}^k c_{ij}y_i$, implying $C = PC'$, and so $C' = P^{-1}C$. Thus $\Phi_{BG}(\alpha)$
and $C$ are row equivalent. If we put both of these results together, we see that
$\Phi_{DG}(\alpha) = P^{-1}\Phi_{BE}(\alpha)Q$, and so $\Phi_{DG}(\alpha)$ and $\Phi_{BE}(\alpha)$ are equivalent.


Example Let a : R? —> R? be the linear transformation given by 


a 
:Ib |m ath 
Qa: pase 


Ly © 
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1 1 ıl! 
E Joar =:| 


voor [Jep e eE 


l E. Note that P~! gg (a) Q = pala). 


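Example We will now see an application of linear algebra to calculus. Let $V$ be the
vector space over $\mathbb{R}$ consisting of all infinitely-differentiable functions $f \in \mathbb{R}^{\mathbb{R}}$, and
let $\delta \in \operatorname{End}(V)$ be the differentiation endomorphism.

(1) If $a$ and $b$ are given real numbers, not both equal to 0, then the functions
$f_0 : x \mapsto e^{ax}\sin(bx)$ and $f_1 : x \mapsto e^{ax}\cos(bx)$ belong to $V$ and the subspace
$W = \mathbb{R}\{f_0, f_1\}$ of $V$ is invariant under $\delta$. The restriction of $\delta$ to $W$ can be
represented with respect to the basis $\{f_0, f_1\}$ of $W$ by the nonsingular matrix
$A = \begin{bmatrix} a & -b \\ b & a \end{bmatrix}$. It is easy to check that $A^{-1} = (a^2 + b^2)^{-1}\begin{bmatrix} a & b \\ -b & a \end{bmatrix}$. Therefore,
\[
\int f_0(t)\,dt = \delta^{-1}(f_0) = \frac{1}{a^2 + b^2}(af_0 - bf_1) \quad\text{and}\quad
\int f_1(t)\,dt = \delta^{-1}(f_1) = \frac{1}{a^2 + b^2}(bf_0 + af_1).
\]

(2) The functions $g_0 : x \mapsto x^2e^x$, $g_1 : x \mapsto xe^x$, and $g_2 : x \mapsto e^x$ all belong to $V$
and the subspace $Y = \mathbb{R}\{g_0, g_1, g_2\}$ of $V$ is invariant under $\delta$. The restriction
of $\delta$ to $Y$ can be represented with respect to the basis $\{g_0, g_1, g_2\}$ of $Y$ by the
nonsingular matrix $B = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}$. Since $B^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 2 & -1 & 1 \end{bmatrix}$, we see
that
\[
\int g_0(t)\,dt = \delta^{-1}(g_0) = g_0 - 2g_1 + 2g_2, \quad
\int g_1(t)\,dt = \delta^{-1}(g_1) = g_1 - g_2, \quad\text{and}\quad
\int g_2(t)\,dt = \delta^{-1}(g_2) = g_2.
\]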


Let us turn to problems connected with the implementation of this theory. Let
$F$ be a field and let $n$ be a positive integer. Let $A = [a_{ij}]$ and $B = [b_{ij}]$ belong
to $M_{n\times n}(F)$ and let $C = AB$. In order to calculate each one of the $n^2$ entries
in $C$, we need $n$ multiplications and $n - 1$ additions/subtractions, and so to
calculate $C$ we need $n^3$ multiplications and $n^2(n - 1) = n^3 - n^2$ additions/subtractions.
Putting this in another way, the total number of operations needed to calculate
$AB$ from the definition is on the order of $n^c$, where $c = 3$. If $n$ is very large,
this can entail considerable computational overhead and leaves room for the
introduction of significant error due to roundoff and truncation in the course of the
calculation. It is therefore very important to find a more sophisticated method of
matrix multiplication, if possible. One such method is the Strassen-Winograd algorithm.


With kind permission of Volker Strassen (Strassen); With 
kind permission of the Department of Computer Science, 
City University of Hong Kong (Winograd). 

Variants of this algorithm were discovered by 
the contemporary German mathematician Volker 
Strassen and the contemporary Israeli mathe- 
matician Shmuel Winograd who later served as 
director of mathematical research at IBM. 


To illustrate the Strassen-Winograd algorithm, let us first begin with the special
case $n = 2$. First, calculate
\begin{align*}
p_0 &= (a_{11} + a_{22})(b_{11} + b_{22}), & p_1 &= (a_{21} + a_{22})b_{11}, & p_2 &= a_{11}(b_{12} - b_{22}), \\
p_3 &= (a_{21} - a_{11})(b_{11} + b_{12}), & p_4 &= (a_{11} + a_{12})b_{22}, & p_5 &= a_{22}(b_{21} - b_{11}), \\
p_6 &= (a_{12} - a_{22})(b_{21} + b_{22}),
\end{align*}
and then note that
\[
C = \begin{bmatrix} p_0 + p_5 - p_4 + p_6 & p_2 + p_4 \\ p_1 + p_5 & p_0 - p_1 + p_2 + p_3 \end{bmatrix}.
\]
In this calculation, we used 7 multiplications and 18 additions/subtractions (Winograd's variant
of this algorithm uses only 15 additions/subtractions, but these are more interdependent,
and so the algorithm is less amenable to implementation on parallel computers)
instead of 8 multiplications and 4 additions/subtractions. In the early days of
computers, when multiplication was several orders of magnitude slower than addition,
this in itself was a great accomplishment. If $n = 4$, we write our matrices in
block form: $A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$ and $B = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}$, where each block is a $2 \times 2$
matrix. We now calculate $2 \times 2$ matrices $P_0, \ldots, P_6$ and then construct $C = AB$
as above. To do this, we need 49 multiplications and 198 additions/subtractions, as
opposed to 64 multiplications and 48 additions/subtractions if one goes according
to the definition. We continue recursively. If $n = 2^h$, then the number of multiplications
needed is $M(h) = 7^h$ and the number of additions/subtractions needed is
$A(h) = 6(7^h - 4^h)$ and so $M(h) + A(h) < 7^{h+1}$. (If $n$ is not a power of 2, we can
add rows and columns of 0's in order to enlarge it to the desired size.) Thus, we see
that the number of arithmetic operations needed to calculate $AB$ is on the order of
$n^c$, where $c \le \log_2 7 = 2.807\ldots$ and so, for large $n$, we have a definite advantage
over multiplication following from the definition. Using even more sophisticated
techniques, it is possible to reduce the number of arithmetic operations to the order
of $n^c$, where $c < 2.376\ldots$, as was done by Winograd and Coppersmith in 1986.
Recent results by American mathematicians Chris Umans and Henry Cohn, using
sophisticated group-theoretic techniques, suggest that $c$ can be reduced still further,
but their methods are not, as yet, practical except for matrices of immense size.

For sparse matrices, namely matrices in which a very large majority of the entries
are 0, these algorithms can be combined with other sophisticated techniques
to produce even faster multiplication. If the matrices are in $M_{n\times n}(F)$ but have no
more than $n$ nonzero entries, then one can multiply them in an order of $n^{2+k(n)}$
operations, where $k(n) \to 0$ as $n \to \infty$.

The size of matrices for which the Strassen-Winograd algorithm is significantly
faster than the regular method depends, of course, on the particular hardware on
which it is being used. The Strassen-Winograd algorithm can also be modified to handle
multiplication of matrices which are not necessarily square.

Unfortunately, the Strassen-Winograd algorithm is no less susceptible to roundoff
and truncation errors than the regular algorithm. On a computer with seven-digit
accuracy, the product
\[
\begin{bmatrix} 211 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \\ 0.001 & 0.032 & 0.043 & 0.044 \\ 311 & 0.0032 & 1233 & 0.0324 \end{bmatrix}
\begin{bmatrix} 50 & 0.32 & 0.0023 & 421 \\ 60 & 0.023 & 0.033 & 982 \\ 23 & 0.032 & 0.03 & 623 \\ 33 & 0.043 & 0.022 & 44 \end{bmatrix}
\]
equals
\[
\begin{bmatrix} 10871 & 67.834 & 0.7293 & 92840 \\ 371 & 0.634 & 0.2463 & 4430 \\ 4.411 & 0.0043 & 0.0033 & 60.57 \\ 43910.3 & 138.977 & 37.7061 & 899094.0 \end{bmatrix}
\]
using the ordinary method of matrix multiplication, whereas, using the Strassen-Winograd
algorithm, we obtain
\[
\begin{bmatrix} 10871 & 68.54 & 0.6294 & 92840 \\ 370.9 & 1.0 & 0.2463 & 4430.18 \\ 4.411 & 0.0043 & 0 & 62.0 \\ 43910.3 & 139.047 & 37.7 & 899095.0 \end{bmatrix}.
\]
This problem can be overcome to some extent by stopping the recursion in the
Strassen-Winograd algorithm early, and doing the bottom-level matrix multiplication
using the ordinary method. Another disadvantage of this algorithm is that it requires
a much larger amount of scratch memory space to perform its calculations.

There are other tricks that can be used to reduce the computations necessary
in matrix multiplication. For example, if $n$ is a positive integer and if $A, B, C, D \in
M_{n\times n}(\mathbb{R})$, then the matrix product $(A + iB)(C + iD)$ in $M_{n\times n}(\mathbb{C})$ can be calculated
using only three matrix multiplications in $M_{n\times n}(\mathbb{R})$, rather than the expected
four, by noting that
\[
(A + iB)(C + iD) = AC - BD + i[(A + B)(C + D) - AC - BD].
\]
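
The three-multiplication identity can be sketched as follows. The helper names `matmul`, `add`, `sub`, and `complex_product` are hypothetical, and matrices are stored as lists of rows; the point of the sketch is only that `matmul` is called three times.

```python
def matmul(a, b):
    """Product of two real matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]


def sub(a, b):
    return [[x - y for x, y in zip(r, s)] for r, s in zip(a, b)]


def complex_product(a, b, c, d):
    """Real and imaginary parts of (a + ib)(c + id), using three real matrix products."""
    ac = matmul(a, c)
    bd = matmul(b, d)
    cross = matmul(add(a, b), add(c, d))
    return sub(ac, bd), sub(sub(cross, ac), bd)


a, b = [[1, 2], [3, 4]], [[0, 1], [1, 0]]
c, d = [[2, 0], [1, 1]], [[1, 1], [0, 2]]
real_part, imag_part = complex_product(a, b, c, d)
```

The imaginary part agrees with the four-multiplication expression $AD + BC$, since $(A + B)(C + D) - AC - BD = AD + BC$.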


If we have a parallel-processing computational system at our disposal, matrix
multiplication can be done much more rapidly. There exist parallel algorithms
to multiply two $n \times n$ matrices in an order of $\log(n)$ time, on the condition
that we have $n^3$ processors working in parallel. Given the availability of
such parallel computational power, one can also invert a nonsingular $n \times n$ matrix
in an order of $\log^2(n)$ time. The first such algorithm was developed by Laszlo
Csanky in 1977, though this algorithm has the disadvantage of being wildly
unstable.

Again, we keep in mind that real and complex numbers are represented in a
computer by approximations having a limited degree of accuracy. The longer
calculations become, the more the error due to roundoff and truncation increases and limits the
correctness of the calculations. It is possible, however, to reduce the effect of roundoff and
truncation errors. Let us recall how our algorithm for inverting
a matrix $A$ worked:

(1) We formed the matrix $[I \ A] = [b_{ij}]$;

(2) We interchanged the first row with one of the rows below it, if necessary, such
that $b_{1,n+1} \ne 0$; we then multiplied this row by $b_{1,n+1}^{-1}$ so that this element is
now equal to 1, and we subtracted multiples of this row from the rows below it,
in order to make $b_{i,n+1}$ equal to 0 for all $1 < i \le n$;

(3) We now iterate this process for the elements $b_{h,n+h}$, where $h = 2, 3, \ldots$ and
so forth. If we cannot do it, i.e., if there exists an $h$ such that $b_{i,n+h} = 0$ for all
$h \le i \le n$, the matrix $A$ is singular. Otherwise, at the end of the process, we
have brought the matrix to the form $[A^{-1} \ I]$.

The elements $b_{h,n+h}$ are called pivots of the algorithm. If we are working over
$\mathbb{R}$ or $\mathbb{C}$, we can minimize roundoff and truncation errors, to some extent, by making
sure that each time we interchange rows we choose to bring into the pivot position
a nonzero number having maximal absolute value. This strategy is known
as partial pivoting. We could do better by also interchanging columns in order to
bring into the pivot position $b_{h,n+h}$ the element $b_{i,n+j}$ ($h \le i, j \le n$) having maximal
absolute value. This strategy is known as full pivoting; it requires a certain
amount of computational overhead on the side so that the columns can be returned
to their proper positions at the end of the algorithm. Although there are matrices
so pathological that full pivoting rather than partial pivoting is needed in order to
invert them, most experts believe that it is not worth the effort and the computational
overhead and that for such matrices one should use other methods altogether.
Partial pivoting also does not work well on parallel or systolic-array computers, since
it requires many nonlocal data movements. Several variants of pivoting strategies
for matrices having specific structures have, however, been developed and are in
wide use.
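
The effect of partial pivoting can be seen on a small floating-point experiment. The sketch below is an illustration under stated assumptions: it works in ordinary double precision, the function name `invert` and the flag `partial_pivoting` are hypothetical, and the test matrix with the tiny entry $10^{-20}$ is chosen here only to make the effect visible.

```python
def invert(a, partial_pivoting=True):
    """Invert a real matrix by row-reducing [I | A] in floating point."""
    n = len(a)
    m = [[float(i == j) for j in range(n)] + [float(x) for x in a[i]]
         for i in range(n)]
    for h in range(n):
        candidates = range(h, n)
        if partial_pivoting:
            # Bring the entry of largest absolute value into the pivot position.
            pivot = max(candidates, key=lambda r: abs(m[r][n + h]))
        else:
            # Accept the first nonzero entry, however small it is.
            pivot = next((r for r in candidates if m[r][n + h] != 0.0), h)
        if m[pivot][n + h] == 0.0:
            raise ValueError("matrix is singular")
        m[h], m[pivot] = m[pivot], m[h]
        scale = m[h][n + h]
        m[h] = [x / scale for x in m[h]]
        for r in range(n):
            if r != h:
                factor = m[r][n + h]
                m[r] = [x - factor * y for x, y in zip(m[r], m[h])]
    return [row[:n] for row in m]


tiny = [[1e-20, 1.0], [1.0, 1.0]]  # true inverse is very close to [[-1, 1], [1, 0]]
pivoted = invert(tiny, partial_pivoting=True)
naive = invert(tiny, partial_pivoting=False)
```

With partial pivoting the (1, 1)-entry of the computed inverse is $-1$ to machine accuracy, while without it the tiny pivot wipes out that entry and the computed value is 0.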

Indeed, let us now consider another method. It is clearly easier to invert a nonsingular
upper-triangular or lower-triangular matrix, namely a matrix in one of these
forms all of the diagonal elements of which are nonzero. Therefore, our job would
be much easier if we could write $A$ in the form $LU$, where $L$ is lower triangular
nonsingular and $U$ is upper triangular nonsingular, for then $A^{-1} = U^{-1}L^{-1}$. This
is not always possible. For example, one can see that there is no way of writing the
matrix $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \in M_{2\times 2}(\mathbb{R})$ in this form. However, it is always possible to write $A$
in the form $LU$ when $A$ equals a product of elementary matrices of the form $E_{i;c}$
and $E_{ij;c}$ only.

How can this be done? Assume that $A = [a_{ij}]$, $U = [u_{ij}]$, and $L = [v_{ij}]$ and
that $A = LU$, where $U$ is upper-triangular nonsingular and $L$ is lower-triangular
nonsingular. Then for each $1 \le i, j \le n$ we have $a_{ij} = \sum_{h=1}^n v_{ih}u_{hj}$. In each of $L$
and $U$ there are only $\frac{1}{2}(n^2 + n)$ entries which can be nonzero and so our problem
is one of solving $n^2$ nonlinear equations in $n^2 + n$ unknowns. This means that we
can allow ourselves to choose the value of $n$ of these variables arbitrarily, and we will
do so by insisting that $v_{ii} = 1$ for all $1 \le i \le n$. Now we have a system of $n^2 + n$
nonlinear equations in $n^2 + n$ unknowns, which can be solved by a method known
as Crout's algorithm:

(1) First set $v_{ii} = 1$ for all $1 \le i \le n$;

(2) For all $1 \le j \le n$ and all $1 \le i \le j$, first calculate $u_{ij} = a_{ij} - \sum_{h=1}^{i-1} v_{ih}u_{hj}$ and
then $v_{ij} = \frac{1}{u_{jj}}\Bigl(a_{ij} - \sum_{h=1}^{j-1} v_{ih}u_{hj}\Bigr)$ for all $j < i \le n$.
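
The two steps of the algorithm translate almost line by line into code. The sketch below assumes exact rational arithmetic and no row interchanges, so it fails with an error when a zero pivot $u_{jj}$ appears; the function name `lu_unit_lower` is hypothetical.

```python
from fractions import Fraction


def lu_unit_lower(a):
    """Return (L, U) with A = LU, L unit lower triangular and U upper triangular."""
    n = len(a)
    lower = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # v_ii = 1
    upper = [[Fraction(0)] * n for _ in range(n)]
    for j in range(n):
        # Entries u_ij of column j of U, for i <= j.
        for i in range(j + 1):
            upper[i][j] = Fraction(a[i][j]) - sum(lower[i][h] * upper[h][j] for h in range(i))
        if upper[j][j] == 0:
            raise ValueError("zero pivot: no factorization of this form without row interchanges")
        # Entries v_ij of column j of L, for i > j.
        for i in range(j + 1, n):
            partial = sum(lower[i][h] * upper[h][j] for h in range(j))
            lower[i][j] = (Fraction(a[i][j]) - partial) / upper[j][j]
    return lower, upper


L, U = lu_unit_lower([[1, 2, 3], [2, 3, 0], [0, 1, 2]])
```

For this matrix, `L` has rows (1, 0, 0), (2, 1, 0), (0, -1, 1) and `U` has rows (1, 2, 3), (0, -1, -6), (0, 0, -4). Calling the function on the matrix with rows (0, 1) and (1, 0) raises the error, in line with the remark that this matrix has no factorization of the form $LU$.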


With kind permission of the National Portrait Gallery (Tur- 
ing); With kind permission of Sir Peter Swinnerton-Dyer 
(Swinnerton-Dyer). 
The LU method was devised by the British mathe- 
matician Alan Turing who is better known as the 
founder of automata theory and one of the fathers 
of the electronic computer. It appears implicitly 
in the work of Jacobi on bilinear forms. The first
computer algorithm to compute LU factorizations using partial pivoting was described by 
the contemporary British mathematicians D.W. Barron and Sir Peter Swinnerton-Dyer. 
Prescott Crout was a twentieth-century American mathematician. 


We note that if $A$ is a nonsingular matrix which can be written in the form $LU$,
where $L = [v_{ij}]$ is a lower-triangular nonsingular matrix satisfying $v_{ii} = 1$ for all
$1 \le i \le n$ and $U = [u_{ij}]$ is upper-triangular and nonsingular, then this factorization
must be unique. Indeed, assume that $L_1U_1 = L_2U_2$ where the $L_h$ are lower-triangular
matrices with 1's on the diagonal, and the $U_h$ are nonsingular upper-triangular
matrices. Then $L_2^{-1}L_1 = U_2U_1^{-1}$. Since the product of lower-triangular matrices is
lower triangular and the product of upper-triangular matrices is upper triangular,
this matrix must be a diagonal matrix. But then $L_2^{-1}L_1 = I$ and so $L_1 = L_2$ and that
implies that $U_1 = U_2$, proving uniqueness.


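Example Some singular matrices may also be written in the form $LU$, but for them
the above uniqueness result is no longer necessarily true. For example,
\[
\begin{bmatrix} 1 & -1 & 2 \\ -1 & 1 & -1 \\ 2 & -2 & 4 \end{bmatrix} =
\begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 2 & b & 1 \end{bmatrix}
\begin{bmatrix} 1 & -1 & 2 \\ 0 & 0 & 1 \\ 0 & 0 & -b \end{bmatrix}
\]
for any scalar $b \in \mathbb{R}$.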


As was previously remarked, not all nonsingular matrices can be written in the
form $LU$. However, we have already noted that any nonsingular matrix $A$ can be
written in the form $PC$, where $P$ is a permutation matrix and $C$ is a product of
elementary matrices of the form $E_{i;c}$ and $E_{ij;c}$, and $C$ can be written in the desired
$LU$ form.


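Example It is easy to verify that $\begin{bmatrix} 0 & 1 & 1 & -3 \\ -2 & 4 & 1 & 4 \\ 0 & 0 & 0 & 1 \\ 3 & 1 & 1 & 0 \end{bmatrix} = PLU$, where
$P = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix}$ is a permutation matrix,
$L = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ -\frac{3}{2} & 7 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$ is lower triangular,
and $U = \begin{bmatrix} -2 & 4 & 1 & 4 \\ 0 & 1 & 1 & -3 \\ 0 & 0 & -\frac{9}{2} & 27 \\ 0 & 0 & 0 & 1 \end{bmatrix}$ is upper triangular.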


In general, the problem of factorization of a square matrix into a product of matrices
of a more desirable form is one which arises often in computational matrix
theory, and many techniques have been developed to facilitate such computations.
One method, for example, is to associate with any matrix $A = [a_{ij}] \in M_{n\times n}(F)$ an
undirected graph $\Gamma_A$ the vertices of which are $\{1, \ldots, n\}$ and in which there exists
an edge connecting $i$ and $j$ if and only if $a_{ij} \ne 0$ or $a_{ji} \ne 0$. If this graph has nice
structure (if it is a tree, for example) then this structure can be exploited to produce
efficient factorization algorithms for $A$, as has recently been shown by Israeli
computer scientist Sivan Toledo.


Exercises 


Exercise 441 


1 3 
Let F = GF(5). Calculate | 2 1 
1 2 


Ww” = = 


12 2 
4 3 2 |in M3x3(F). 
1 4 2 


Exercise 442 
Does there exist a real number b such that the matrices 


10 -1i 0 b -1 -1 0 
01 0 -i -1 b 0 -I 
slig =i pO Pa a oe et 
01 0 -I 0 1-1 §$ 


are a commuting pair in M4x4(R)? 
Exercise 443
Let F = GF(7) and let K be the subalgebra of M2,2(F) consisting of all ma- 


trices of the form = A , for a,b € F. Show that K is a field. Is it a field if 


b 
F = GF (5)? 
Exercise 444 


Let A = n l € Mə2x2(R). Find the set of all matrices B € Mnxn(R) satis- 


fying BA= AB. 

Exercise 445 

Let A= E l € M?2x2(C). Find a complex number c satisfying (cA? =A. 
Exercise 446 


Let $A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix} \in M_{3\times 3}(\mathbb{R})$. Find a positive integer $k$ satisfying $A^k = A^{-1}$.


Exercise 447 


Let $F$ be a field. Find all matrices $A \in M_{3\times 3}(F)$ satisfying $A^2 = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$.


Exercise 448 

Let $n$ be a positive integer and let $F$ be a field of characteristic 0. Show that
$AB - BA \ne I$ for all $A, B \in M_{n\times n}(F)$ (in other words, that $I$ is not the product
of any two elements of the Lie algebra $M_{n\times n}(F)^-$).


Exercise 449 
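Show that there are infinitely many pairs $(a, b)$ of real numbers satisfying the
condition
\[
\begin{bmatrix} a & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & b \end{bmatrix}
\begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix} =
\begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} a & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & b \end{bmatrix}.
\]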
Exercise 450
Does there exist a positive integer $k$ satisfying
\[
\begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}
\begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}^k =
\begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}?
\]
Exercise 451
Let F = GF(3). Show that there exist at least 27 distinct matrices A in M3x3(F) 


satisfying A? = /. 


Exercise 452 
If $F = GF(2)$, find the set of all pairs $(A, B)$ of matrices in $M_{2\times 2}(F)$ satisfying
$AB - BA = I$.


Exercise 453 


For a field F, find {A € M2x2(F) | A? = 0}. 


Exercise 454 


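Find a matrix $A \in M_{3\times 3}(\mathbb{R})$ satisfying $A\begin{bmatrix} 1 & 1 & -1 \\ 2 & 1 & 0 \\ 1 & -1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & -1 & 3 \\ 4 & 3 & 2 \\ 1 & -2 & 5 \end{bmatrix}$.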


Exercise 455 


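Show that if $A = \begin{bmatrix} a & 1 & 0 \\ 0 & a & 1 \\ 0 & 0 & a \end{bmatrix} \in M_{3\times 3}(\mathbb{R})$ then for each $n > 1$ we have
\[
A^n = \begin{bmatrix} a^n & na^{n-1} & \frac{1}{2}n(n-1)a^{n-2} \\ 0 & a^n & na^{n-1} \\ 0 & 0 & a^n \end{bmatrix}.
\]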


Exercise 456 


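Let $(K, \bullet)$ be an associative unital algebra over a field $F$ and let $S$ be the subset
of $M_{3\times 3}(K)$ consisting of all matrices of the form $\begin{bmatrix} v_{11} & 0_K & v_{13} \\ 0_K & v_{22} & 0_K \\ v_{31} & 0_K & v_{33} \end{bmatrix}$. Is $S$ an
$F$-subalgebra of $M_{3\times 3}(K)$?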


Exercise 457 


Let $n$ be a positive integer and let $F$ be a field. A matrix in $M_{n\times n}(F)$ of the form
\[
\begin{bmatrix} a_1 & a_2 & \cdots & a_n \\ a_n & a_1 & \cdots & a_{n-1} \\ \vdots & \vdots & & \vdots \\ a_2 & a_3 & \cdots & a_1 \end{bmatrix}
\]
is called a circulant matrix. Determine if the set of all
circulant matrices in $M_{n\times n}(F)$ is an $F$-subalgebra of $M_{n\times n}(F)$.

© Collections artistiques de l'Université de Liège. Reprinted with kind permission.
Circulant matrices, which have many important applications, were
first studied by the nineteenth-century French mathematician Eugène
Catalan.


Exercise 458 


Let $n$ be a positive integer and let $F$ be a field. If $A \in M_{n\times n}(F)$ is a nonsingular
circulant matrix, is $A^{-1}$ necessarily a circulant matrix?


Exercise 459 
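Let $K$ be the subset of $M_{2\times 2}(\mathbb{Q})$ consisting of all matrices of the form $\begin{bmatrix} a & b \\ 2b & a \end{bmatrix}$,
where $a, b \in \mathbb{Q}$. Show that $K$ is a $\mathbb{Q}$-subalgebra of $M_{2\times 2}(\mathbb{Q})$ which is, in fact, a
field.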


Exercise 460 
Find a matrix A € M2x2(R) satisfying A? = E il 


Exercise 461 


Find all matrices $A \in M_{3\times 3}(\mathbb{R})$ satisfying $A\begin{bmatrix} 1 & 1 & 1 \\ 2 & 2 & 2 \\ 0 & 1 & 1 \end{bmatrix} = O$.


Exercise 462 


Let $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \in M_{2\times 2}(\mathbb{R})$ be an idempotent matrix. Show that $a + d \in
\{0, 1, 2\}$.


Exercise 463 


n n-1 n-1 
Show that l 4 = Ee ae for alln > 1. 


Exercise 464 


Let F be a field and let A = [° p 


| € Mox2(F). Find A” for all n > 1. 


Exercise 465
Find matrices $A, B \in M_{2\times 2}(\mathbb{Q})$ for which
\[
(A - B)(A + B) \ne A^2 - B^2.
\]

Exercise 466
Let $F$ be a field and let $A = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} \in M_{3\times 3}(F)$. Show that $A^{k+2} = A^k +
A^2 - I$ for all positive integers $k$.


Exercise 467 
Let n be a positive integer and let F be a field. Let A, B E€ Myxn(F) satisfy 
A + B = I. Show that AB = O if and only if A and B are idempotent. 


Exercise 468 

Let $n$ be a positive integer and let $(K, \bullet)$ be an associative unital algebra over a
field $F$. Define a new operation $\odot$ on $M_{n\times n}(K)$, called the Schur product (sometimes
also called the Hadamard product, especially in the context of statistics),
by setting $[v_{ij}] \odot [w_{ij}] = [v_{ij} \bullet w_{ij}]$, for all $1 \le i, j \le n$. Is $(M_{n\times n}(K), +, \odot)$
an $F$-algebra? Is it associative? Is it unital? When is it commutative?


Exercise 469 
Let $n$ be a positive integer and for each $A = [a_{ij}] \in M_{n\times n}(\mathbb{R})$, let $\mu(A) =
\max_{1 \le i, j \le n} |a_{ij}|$. Show that $\mu(A^2) \le n\mu(A)^2$ for all $A \in M_{n\times n}(\mathbb{R})$.


Exercise 470 
0 1 0 
Let F be a field. Find a matrix A € M3,3(F) satisfying A7=|0 0 0 | or 
0 0 0 
show that no such matrix exists. 


Exercise 471 
Find a matrix A € M2x2(Q) satisfying A [° l AT = E 0 | for all 


c 0 
cEQ. 
Exercise 472 


Let F be a field and let n be a positive integer. Show that H};AM\,BA\; = 
Ay, BHAA, for all A, Be Mnxn (F). 


Exercise 473 
Let F be a field and let n be a positive integer. Show that (77) X j=1 Hij AHji)B 
— BO, pee Hy; AH jj) for all A, Be Maxn(F). 


Exercise 474 
Exercise 475
Find infinitely-many triples (A, B, C) of nonzero matrices in M3,.3(Q), the en- 
tries of which are nonnegative integers, satisfying the condition A? + B? = C?. 


Exercise 476 
Let $F$ be a field. Find a matrix $A \in M_{4\times 4}(F)$ satisfying $A^4 = I \ne A^2$.


Exercise 477 

Let $n$ be a positive integer and let $F = GF(p)$ for some prime integer $p$.
Show that for any $A \in M_{n\times n}(F)$ there exist positive integers $k > h$ satisfying
$A^k = A^h$. Would this also be true if we chose $F = \mathbb{Q}$?


Exercise 478 

Let A = [ajj] E€ M2x2(C) be a matrix satisfying the condition that slau + 
an] Æ ./a11422 — a12a21. Show that there exist four distinct matrices B € 
Mo x2(C) satisfying B? =A. 


Exercise 479 
Let c be a given complex number. Find the set of all matrices A € M2x2(C) 
satisfying (A — c1}? = O. 


Exercise 480 
Show that $\begin{bmatrix} 3 - 4c & 2 - 4c & 2 - 4c \\ -1 + 2c & 2c & -1 + 2c \\ -3 + 2c & -3 + 2c & -2 + 2c \end{bmatrix}$ is involutory for all complex
numbers $c$.


Exercise 481 

Let $n$ be a positive integer and let $F$ be a field. How many matrices $A = [a_{ij}] \in
M_{n\times n}(F)$ having entries in $\{0, 1\}$ satisfy the condition that each row and each
column contain exactly one 1?


Exercise 482 
Show that for an integer n > 4 and for a field F there exist matrices A and B in 
Mnxn(F) satisfying A? = B? = O but AB = BA £ O. 


Exercise 483 

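Let $F = GF(2)$ and let $F'$ be a field of characteristic other than 2. Define a
function $\varphi : M_{2\times 2}(F') \to M_{2\times 2}(F)$ as follows: If $A = [a_{ij}] \in M_{2\times 2}(F')$ then
set $\varphi(A) = [b_{ij}]$, where
\[
b_{ij} = \begin{cases} 1 & \text{if } a_{ij} \ne 0, \\ 0 & \text{otherwise.} \end{cases}
\]
Is $\varphi(A + A') = \varphi(A) + \varphi(A')$ for all $A, A' \in M_{2\times 2}(F')$? Is $\varphi(AA') =
\varphi(A)\varphi(A')$ for all $A, A' \in M_{2\times 2}(F')$?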
Exercise 484
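Find a matrix $I \ne A \in M_{3\times 3}(\mathbb{Q})$ satisfying
\[
A\begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}A
\quad\text{and}\quad
A\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}A.
\]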


Exercise 485 
For each real number a, find a matrix B(a) € M2x2(R) satisfying 


eae cane) |= 8 1 | 
E 1 


sin(a) cos(a) sin(a) 
or show that such matrices need not exist. 


Exercise 486 


5 


Let A = E 


A € M2x2 (R). What is A14? 


Exercise 487 


Find all pairs (a, b) of rational numbers such that the matrix A = E z] E€ 


M2 x2(Q) is idempotent. 


Exercise 488 

Let $F$ be a field and let $n$ be a positive integer. Show that there do not
exist nonsingular matrices $P, Q \in M_{n\times n}(F)$ satisfying $PAQ = A^T$ for all
$A \in M_{n\times n}(F)$.


Exercise 489 

Let $F$ be a field and let $A, B \in M_{n\times n}(F)$ be a commuting pair of matrices,
where $B$ is nonsingular. Is $(A, B^{-1})$ necessarily a commuting pair?

Exercise 490 


Let $F$ be a field. Is $S = \left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \,\middle|\, a + c = b + d \right\}$ an $F$-subalgebra of
$M_{2\times 2}(F)$?


Exercise 491 

Let $F$ be a field of characteristic other than 2, let $n$ be a positive integer, and let
$A \in M_{n\times n}(F)$ be an involutory matrix. For each $c \in F$, let $B_c = c(A + I)$. For
which values of $c$ do we have $B_c^2 = B_c$?


Exercise 492 


Find all rational numbers a, b, and d satisfying the condition that [| | € 


Mo x2(Q) is involutory. 
Exercise 493
Let F = GF(3) and let A = x o € M3x3(F). Show that the subset 


$\{O, I, 2I, A, I + A, 2I + A, 2A, I + 2A, 2I + 2A\}$ of $M_{3\times 3}(F)$ is a field under
addition and multiplication of matrices.


Exercise 494 
Let $F = GF(p)$, where $p$ is a prime integer, and let $K$ be the subset of $M_{2\times 2}(F)$
consisting of all matrices of the form $\begin{bmatrix} a & b \\ -b & a \end{bmatrix}$, where $a, b \in F$. Show that $K$,
together with the operations of matrix addition and multiplication, is a field when
$p = 3$ and is not a field when $p = 5$. What happens when $p = 7$?


Exercise 495 
Let $n$ be a positive integer, let $F$ be a field, and let $O \neq A, B \in M_{n\times n}(F)$. Show
that there exists a matrix $C \in M_{n\times n}(F)$ satisfying $ACB \neq O$.


Exercise 496 
Find all matrices A, B € M2x2(R), the entries of which are nonnegative inte- 
gers, which satisfy AB = ‘ ; 
Exercise 497 
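Let $V = M_{3\times 3}(\mathbb{Q})$. For each rational number $t$, let $\alpha_t : V \to V$ be the linear
transformation $A \mapsto A \begin{bmatrix} 0 & 1 & 3 \\ t & 0 & 0 \\ 0 & -1 & 4 \end{bmatrix}$. Is the function $t \mapsto \alpha_t$ a linear
transformation from $\mathbb{Q}$ to $\operatorname{End}(V)$, both considered as vector spaces over $\mathbb{Q}$?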
Exercise 498 


Let $n$ be a positive integer, let $F$ be a field, and for some fixed $c \in F$, let $A = [a_{ij}]$
be the matrix in $M_{n\times n}(F)$ defined by

\[ a_{ij} = \begin{cases} c & \text{when } i + j \text{ is even,} \\ 0 & \text{otherwise.} \end{cases} \]

Show that the subset $\{A, A^2, A^3\}$ of $M_{n\times n}(F)$ is linearly dependent.


Exercise 499 


0001 
1001 

Let F = GF(2) and let A=] 1 9 go | © Maxa(F)- Let L= {0} U 
0010 


$\{A^i \mid i > 0\} \subseteq M_{4\times 4}(F)$. Show that $L$ is closed under addition. Is $L$, under the
usual definitions of addition and multiplication of matrices, a field?
Exercise 500
Let K be the set of all matrices in M2x2(Q) of the form ; Be Show that 


K is a subalgebra of M2x2(Q) which is in fact a field. 


Exercise 501 
Find the set of all matrices A € M2x2(Q) which satisfy A? + A = | Ë i 


Exercise 502 
Let A = 7 ol € Mə2x2(Q) and let B and C be matrices in M2x2(Q) satis- 
fying AB = BA and AC = CA. Show that BC = CB. 


Exercise 503 
Find infinitely-many matrices A E€ M3x3(Q) satisfying 


1 -1 2 1 2 0 1 
A|2 0 1);==]/0 2 -3 
3 -1 3] ?loo o0 
Exercise 504 
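Let $A = \begin{bmatrix} 1 & -1 & -1 \\ -1 & 1 & -1 \\ -1 & -1 & 1 \end{bmatrix} \in M_{3\times 3}(\mathbb{Q})$. Find functions $f$ and $g$ from the set of
all positive integers to $\mathbb{Q}$ satisfying the condition that

\[ A^n = \begin{bmatrix} f(n) & g(n) & g(n) \\ g(n) & f(n) & g(n) \\ g(n) & g(n) & f(n) \end{bmatrix} \]

for all $n \ge 1$.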


Exercise 505 
Let $F = GF(2)$. Do there exist matrices $A = [a_{ij}]$ and $B = [b_{ij}]$ in $M_{2\times 2}(F)$
satisfying $a_{11} + a_{22} = 1$, $b_{11} + b_{22} = 0$, and $AB = I$?


Exercise 506 

Let $F$ be a field and let $G$ be the set of all matrices in $M_{3\times 3}(F)$ of the form
$\begin{bmatrix} 1 & 0 & 0 \\ a & 0 & 0 \\ 0 & 0 & b \end{bmatrix}$, where $a, b \in F$. Is $G$ closed under matrix multiplication? Does
there exist a matrix $J$ in $G$ satisfying the condition that $AJ = A$ for all $A \in G$?
If such a matrix $J$ exists, is it necessarily true that $JA = A$ for all $A \in G$?


Exercise 507 
Let $n$ be a positive integer and let $F$ be a field. Let $A$ and $B$ be matrices in
$M_{n\times n}(F)$ of the form $\begin{bmatrix} I & A' \\ O & I \end{bmatrix}$ and $\begin{bmatrix} I & B' \\ O & I \end{bmatrix}$ respectively, where $A'$ and $B'$
are (not-necessarily square) matrices of the same size. Find necessary conditions
for $A$ and $B$ to satisfy $AB = BA$.


Exercise 508 
Let $F$ be a field and let $A, B \in M_{2\times 2}(F)$. Show that $(AB - BA)^2$ is a diagonal
matrix.


Exercise 509 

Let n be a positive integer and let F be a field. Let A E€ Mnxn(F) be a diagonal 
matrix having distinct entries on the diagonal. Let B € Mnxn(F) be a matrix 
satisfying AB = BA. Show that B is also a diagonal matrix. 


Exercise 510 

Let n be a positive integer and let F be a field. For each integer —n < t < n, let 
D;(F) be the set of all matrices A = [ajj] € Mnxn(F) satisfying the condition 
that a;; = 0 when j #i + t. Thus, for example, Do(F) is the set of all diagonal 
matrices in Mnxn(F). If A € D;(F) and B € D;(F), does there necessarily exist 
an integer —n < u < t such that AB € D,(F)? 


Exercise 511 


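Let $A = \begin{bmatrix} 1 & 2 & 3 \\ -1 & -2 & -3 \\ 2 & 4 & 6 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$ be matrices in $M_{3\times 3}(\mathbb{R})$.
Find infinitely-many lower-triangular matrices $C$ satisfying $A = CB$.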


Exercise 512 

Let $n$ be a positive integer and let $F$ be a field. Let $A_1, \ldots, A_n$ be upper-triangular
matrices in $M_{n\times n}(F)$ satisfying the condition that the $(i, i)$-entry in $A_i$ is equal
to 0 for $1 \le i \le n$. Show that $A_1 \cdots A_n = O$.


Exercise 513 

Let F be a field in which we have elements a 4 0 and b. Show that there exists 
an upper-triangular matrix C € M2x2(F) satisfying k d C= E a Is C 
necessarily unique? 


Exercise 514 
Let $F$ be a field. Find an element $A$ of $M_{2\times 2}(F)$ satisfying $AA^T \neq A^TA$.


Exercise 515 
Let $F$ be a field and let $n > 1$. If a matrix $A \in M_{n\times n}(F)$ satisfies $AA^T = O$,
does it necessarily follow that $A^TA = O$?


Exercise 516 
Let $n$ be a positive integer, let $F$ be a field, and let $A \in M_{n\times n}(F)$ satisfy the
condition $A = AA^T$. Show that $A^2 = A$.


Exercise 517
Let $n$ be a positive integer, let $F$ be a field, and let $A, B \in M_{n\times n}(F)$ be symmetric
matrices. Is $ABA$ necessarily symmetric?


Exercise 518 
Let $n$ be a positive integer and let $F$ be a field. If $A \in M_{n\times n}(F)$ is symmetric, is
$A^h$ symmetric for all $h > 1$?


Exercise 519 


1 -2 1 3 —1 1 : 
Show that ÂE | ; E ‘| ; | 1 El forms a basis for the subspace 


of M2 x2(Q) consisting of all symmetric matrices. 


Exercise 520 


Does there exist a matrix A E€ M2x2(R) satisfying AA! = E d ? 


Exercise 521 
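Given real numbers $a$, $b$, and $c$, find all real numbers $d$ such that

\[ \begin{bmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & -1 & a \\ 0 & -1 & a & b \\ -1 & a & b & c \end{bmatrix} \begin{bmatrix} a & b & c & 1 \\ 1 & 0 & 0 & 0 \\ 0 & d & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \]

is symmetric.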


Exercise 522 
Find a matrix $B \in M_{2\times 2}(\mathbb{Q})$ such that Nievergelt's matrix equals $B^TB$.


Exercise 523 


1 2 -3 
Calculate | 0 1 2 in M3 3(R). 
0 0 1 
Exercise 524 
Let $a \in \mathbb{R} \setminus \{1, -2\}$. Calculate $\begin{bmatrix} a & 1 & 1 \\ 1 & a & 1 \\ 1 & 1 & a \end{bmatrix}^{-1} \in M_{3\times 3}(\mathbb{R})$.
Exercise 525 
Does there exist an $a \in \mathbb{R}$ such that $\begin{bmatrix} -3 & 4 & 0 \\ 8 & 5 & -2 \\ a & -7 & 6 \end{bmatrix} \in M_{3\times 3}(\mathbb{R})$ is singular?


Exercise 526

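Let $n$ be a positive integer. Each complex number $c$ defines a matrix $A(c) = [a_{ij}] \in M_{n\times n}(\mathbb{C})$
given by $a_{ij} = c^{(i-1)(j-1)}$ for all $1 \le i, j \le n$. If $\omega = e^{2\pi i/n} \in \mathbb{C}$,
show that $A(\omega)$ is nonsingular and satisfies $A(\omega)^{-1} = \frac{1}{n} A(\omega^{-1})$.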


Exercise 527 
Let n be a positive integer and let F be a field. Given a matrix B € Myxn(F), 


; —Bv |. 
do there exist vectors u, v € F” such that the matrix T T is non- 
—u' Bu’ Bu 
singular? 
Exercise 528 


Is the matrix $\begin{bmatrix} 1 + X & -X \\ X & 1 - X \end{bmatrix} \in M_{2\times 2}(\mathbb{Q}[X])$ nonsingular?


Exercise 529 

2 l-a 0 

Is the matrix 0 l—a? 1—a | €M3,3(C) nonsingular, where a = 
l-a 0 1-a? 

—5+4/-3€C. 


l-a 


Exercise 530 
Let $n$ be a positive integer and let $F$ be a field. If $A \in M_{n\times n}(F)$ is nonsingular, is
the same necessarily true for $A + A^T$?


Exercise 531 

Let $n$ be a positive integer and let $F$ be a field. Let $A = [a_{ij}] \in M_{n\times n}(F)$ satisfy
the condition that $\sum_{i=1}^n a_{ij} = 1$ for all $1 \le j \le n$. Show that the matrix $I - A$ is
singular.


Exercise 532 
Let $n$ be a positive integer and let $F$ be a field. If $A \in M_{n\times n}(F)$ is a Markov
matrix, is $A^{-1}$ necessarily a Markov matrix?


Exercise 533 
Let n be a positive integer and let F be a field. For A E€ Myxn(F), show that A? 
is nonsingular if and only if A? is nonsingular. 


Exercise 534 
Let F = GF(p), where p is a prime integer, and let n be a positive integer. What 
is the probability that a matrix in Mnxn(F), chosen at random, is nonsingular? 
Exercise 535

Let $P = \begin{bmatrix} 0 & -1 \\ 1 & -1 \end{bmatrix} \in M_{2\times 2}(\mathbb{R})$ and let $A$ and $Q$ be nonsingular matrices in
$M_{2\times 2}(\mathbb{R})$. Set $B = AQ^{-1}PQ$. Show that $B$ is nonsingular and $A^{-1} + B^{-1} = (A + B)^{-1}$.


Exercise 536 
Show that there are infinitely-many involutory matrices in M2x2(Q). 


Exercise 537 
Let F = GF(2). Is the sum of all nonsingular matrices in M2x2(F) nonsingular? 


Exercise 538 
Let $F$ be a field and let $U$ be the set of all nonsingular matrices in $M_{2\times 2}(F)$. Is
the function $\theta : U \to U$ defined by $\theta : A \mapsto A^2$ a permutation of $U$?


Exercise 539 
Let $n$ be a positive integer, let $F$ be a field, and let $A \in M_{2n\times 2n}(F)$ be a matrix
which can be written in the form $\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$, where each $A_{ij} \in M_{n\times n}(F)$ is
nonsingular. Is $A$ necessarily nonsingular?


Exercise 540 
Let n be a positive integer and let F be a field. Do there exist matrices A, B € 


2 
Mnaysxn(F) such that the matrix | A AR 


BA R | € Monx2n(F) is nonsingular? 


Exercise 541 
Let $n$ be a positive integer and let $F$ be a field. For $A, B \in M_{n\times n}(F)$ with $A$
nonsingular, show that $(A + B)A^{-1}(A - B) = (A - B)A^{-1}(A + B)$.


Exercise 542 

Let $n$ and $p$ be positive integers and let $F$ be a field. Let $A \in M_{n\times n}(F)$ and let
$B, C \in M_{n\times p}(F)$ be matrices satisfying the condition that $A$ and $(I + C^TA^{-1}B)$
are nonsingular. Show that $A + BC^T$ is nonsingular, and that

\[ (A + BC^T)^{-1} = A^{-1} - A^{-1}B(I + C^TA^{-1}B)^{-1}C^TA^{-1}. \]
Exercise 543 
Let $n$ be a positive integer and let $F$ be a field. If $\begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix} \neq v \in F^n$, show that there
exists a nonsingular matrix in $M_{n\times n}(F)$ the rightmost column of which is $v$.


Exercise 544
Let F be a field. Show that every nonsingular matrix in M2x2(F) can be written 


as a product of matrices of the form i I | i or | 


a 0 
1 0 0 1 ]foraer, 


0 1 


Exercise 545 


For each real number $t$, let $A(t) = \begin{bmatrix} 1 & 0 & t \\ t & 1 & \frac{1}{2}t^2 \\ 0 & 0 & 1 \end{bmatrix} \in M_{3\times 3}(\mathbb{R})$. Show that
each such matrix is nonsingular and that the set of all such matrices is closed
under taking products.


Exercise 546 

Let $n$ be a positive integer and let $F$ be a field. Let $A \in M_{n\times n}(F)$ be a matrix for
which there exists a positive integer $k$ satisfying $A^k = O$. Show that the matrix
$I - A$ is nonsingular and find $(I - A)^{-1}$.


Exercise 547 

Let $n$ be a positive integer and let $F$ be a field. Let $A \in M_{n\times n}(F)$ be a matrix for
which there exists a matrix $B \in M_{n\times n}(F)$ satisfying $I + A + AB = O$. Show
that $A$ is nonsingular.


Exercise 548 

Let $n$ be a positive integer and let $F$ be a field. Let $A, B \in M_{n\times n}(F)$ satisfy the
condition that $A$ and $A + B$ are nonsingular. Show that $I + A^{-1}B$ is nonsingular
and that $(I + A^{-1}B)^{-1} = (A + B)^{-1}A$.


Exercise 549 
Find matrices $A$ and $B$ in $M_{2\times 2}(\mathbb{R})$ satisfying $A^2 = B^2 = O$ such that $A + iB$
is a nonsingular matrix in $M_{2\times 2}(\mathbb{C})$.


Exercise 550 


Let F be a field and let A = 


a Om 


0 b 
1 0 | € M3x3(F), where ab 4 1. Show that 
0 1 
A is nonsingular and calculate A~!. 


Exercise 551 


Let c 4 0 be an element of a field F and let A = e Maxa(F). 


ooon 
coon = 
oo FO 
te) 


Is A is nonsingular? If so, find A~!. 
Exercise 552

Let $n > 1$ and let $B \in M_{n\times n}(\mathbb{Q})$ be the matrix all of the entries of which are equal
to 1. Show that there exists a matrix $A \in M_{n\times n}(\mathbb{Q})$ satisfying the condition that
$A + cB$ is nonsingular for all rational numbers $c$.


Exercise 553 
Let $n > 1$ and let $B \in M_{n\times n}(\mathbb{Q})$ be the matrix all of the entries of which are
equal to 1. Find a rational number $t$ such that $(I - B)^{-1} = I - tB$.


Exercise 554 
Let $n$ be a positive integer and let $A = [a_{ij}] \in M_{n\times n}(\mathbb{R})$ be the matrix defined
by $a_{ij} = \min\{i, j\}$ for all $1 \le i, j \le n$. Show that $A$ is nonsingular.


Exercise 555 
Let $A = [a_{ij}] \in M_{4\times 4}(\mathbb{R})$ be the matrix defined by

\[ a_{ij} = \begin{cases} 2 & \text{if } i = j - 1, \\ 1 & \text{otherwise.} \end{cases} \]

Show that $A$ is nonsingular and calculate $A^{-1}$.


Exercise 556 
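For each real number $a$, let $G(a) = \begin{bmatrix} \cos(a) & \sin(a) \\ -\sin(a) & \cos(a) \end{bmatrix} \in M_{2\times 2}(\mathbb{R})$. Given
real numbers $a$, $b$, and $c$, show that $G(a, b, c) = \begin{bmatrix} G(a) & G(b) \\ O & G(c) \end{bmatrix} \in M_{4\times 4}(\mathbb{R})$ is
nonsingular, and find $G(a, b, c)^{-1}$.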


Exercise 557 
Find a singular matrix in M3x3(Q) the entries of which (in some order) are the 
integers 1,2,...,9. 


Exercise 558 
Let $n$ be a positive integer and let $F$ be a field. Given elements $b, c \in F$, let
$A = [a_{ij}] \in M_{n\times n}(F)$ be the matrix defined by

\[ a_{ij} = \begin{cases} b & \text{if } i = j, \\ c & \text{otherwise.} \end{cases} \]

Find necessary and sufficient conditions for $A$ to be nonsingular.
Exercise 559 


Give an example of a singular matrix in $M_{3\times 3}(\mathbb{Q})$ the entries of which are distinct
prime positive integers, or show that no such matrix can exist.


Exercise 560


Let F be a field and let D = | f 


1 0 
the condition that A? DA = D. Show that A is nonsingular. 


| E€ Mox2(F). Let A E€ M2x2(F) satisfy 


Exercise 561 
Let n be a positive integer and let F be a field. Is the set of all singular matrices 


in My xn(F) closed under taking products? 


Exercise 562 


Let n be a positive integer, let F be a field, and let A, B E€ Mnxn(F). Show that 


d € Monx2n (F) 


A and B are both nonsingular if and only if the matrix E B 


is nonsingular. 


Exercise 563 


Write the matrix | ; E € Mə2x2(R) as a product of elementary matrices. 


Exercise 564 


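Find the change of basis matrix from the canonical basis $B$ of $\mathbb{R}^3$ to the basis
$D = \left\{ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \right\}$ and the change of basis matrix from $D$ to $B$.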


Exercise 565 


Let G = | f h | 0 Æa eR}. Show that there exists a matrix E € G satisfy- 


ing the condition that EA = A = AE for all A € G. For each A € G, show that 


there exists a matrix AŤ € G satisfying AA* = E = A*A. 


Exercise 566 
Let F be a field. Given matrices A, B € Mo,.2(F), find the set of all matrices 


C € M2x2(F) satisfying (AB — BA)C = C(AB — BA). 
Exercise 567
Let F be a field and let G be the set of all automorphisms of F? which are rep- 


; i i : a b 
resented with respect to the canonical basis by a matrix of the form | 0 al | 


Is G a group of automorphisms of F*? 


Exercise 568 
Let G be the set of all automorphisms of Q? which are represented with respect 


i 4 | where a,d>0.IsGa 


to the canonical basis by a matrix of the form | 0 d 


group of automorphisms of Q?? 


Exercise 569 

Let W1 C W2 C--- C W, be a fixed sequence of subspaces of a vector space V 
finitely generated over a field F. If a € Aut(V), we say that given sequence 
is an a-fan if and only if each of the W; is invariant under a. Show that 
G = {a € Aut(V) | the given sequence is an a-fan} is a group of automorphisms 
of V. 


Exercise 570 

For any real number ¢ and any positive integer n, we can define the matrix 
P(n,t) E€ Mnxn(R) to equal the identity matrix J in the case t = 0 and oth- 
erwise to equal the matrix [ p;;] defined by 


0 if i <j, 
Pij = (Zie otherwise. 
j 


Show that P(n, s)P (n, t) = P(n, s + t) for all s,t € R. In particular, show that 
each matrix P(n, t) is nonsingular. 


Exercise 571 
Let F be a field and let X be an indeterminate over F. Find matrices P and Q in 


2 
M2 x2(F[X]) such that the matrix P | IRA x 


xX 1+ A Q is a diagonal matrix. 


Exercise 572 
Let n be a positive integer and let œ : M2x2(C) > M4x4(R) be the function 


a b c d 

.|a+bi c+di =b a =d č a foat 

defined bya: | 27 a Ef ek . Show that « is a lin- 
-f e -h g 


ear transformation of vector spaces over R. Is it a homomorphism of unital R- 
algebras? 
Exercise 573
Let $F$ be a field and let $A \in M_{2\times 2}(F)$. Explicitly find a nonsingular matrix
$P \in M_{2\times 2}(F)$ satisfying $PAP^{-1} = A^T$.


Exercise 574 

Let $Y$ be the subspace of $M_{3\times 3}(\mathbb{R})$ consisting of all skew-symmetric matrices.
Show that $Y$ is isomorphic to $\mathbb{R}^3$ and find an isomorphism $\alpha : \mathbb{R}^3 \to Y$ satisfying
the condition that $\alpha(v)w = v \times w$ for all $v, w \in \mathbb{R}^3$.


Exercise 575 


Let A = k ‘| € M>,2(R). Does there exist a matrix B € M>,.2(R) sat- 
isfying B? = A? Does there exist a matrix C € M4,.4(R) satisfying C? = 
A O 
9 
O Al’ 
Exercise 576 
1 1 0 0 
0 1 0 0 : ; ; 
Let A = 0011 E€ M4x4(Q). Find the set of all monic polynomials 
000 1 


D(X) € Q[X] of degree 2 satisfying the condition that p(Ay* = A. (Caution: this 
set may be empty.) 


Exercise 577 

(Simpson's rule) Let $a < b$ be real numbers and let $c = \frac{1}{2}(a + b)$ be the midpoint
of the interval $[a, b]$. Given a continuous function $f \in \mathbb{R}^{[a,b]}$, use Lagrange
interpolation to show that $\int_a^b f(t)\,dt$ is approximately equal to
$\frac{b-a}{6}[f(a) + 4f(c) + f(b)]$.
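The approximation named in this exercise can be tried out numerically. The following is a minimal sketch, not part of the exercise or its solution; the function name `simpson` and the two test integrands are illustrative choices. It evaluates the three-point formula once on the whole interval.

```python
import math

def simpson(f, a, b):
    """Three-point Simpson approximation of the integral of f over [a, b]."""
    c = (a + b) / 2                      # midpoint of the interval
    return (b - a) / 6 * (f(a) + 4 * f(c) + f(b))

# The rule reproduces the integral of a cubic up to rounding: the integral of t^3 over [0, 2] is 4.
cubic = simpson(lambda t: t ** 3, 0.0, 2.0)

# For sin on [0, pi] the exact integral is 2; the rule gives 2*pi/3, a rough estimate.
sine = simpson(math.sin, 0.0, math.pi)
```

The second value shows that a single application on a long interval can be fairly coarse, which is why the rule is usually applied on short subintervals.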


The eighteenth-century British mathematician Thomas Simpson was 
noted for his work on numerical approximations in calculus. 


Exercise 578 
Let F be a field and let k < n be positive integers. Let A € My xn(F) be writ- 
Ai Arn 


, Where Aj; is nonsingular. Let v, w € Fk 
A21 P] 8 


ten in block form as | 
=[ Ai Ay Ap
Ap Ac! An — Ar Aq} 412 


[eleet] 


and v’,w’ € F"-* and let B | Show that 


10 Systems of Linear Equations


Let k and n be positive integers. The classical problem of linear algebra is to find all 
solutions (if any exist) to a system of k linear equations in n unknowns of the form 


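\[ \begin{aligned} a_{11}X_1 + \cdots + a_{1n}X_n &= b_1, \\ a_{21}X_1 + \cdots + a_{2n}X_n &= b_2, \\ &\;\;\vdots \\ a_{k1}X_1 + \cdots + a_{kn}X_n &= b_k, \end{aligned} \]


where the $a_{ij}$ and the $b_i$ are scalars belonging to some field $F$ and the $X_j$ are
variables which take values in the field.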

What about infinite systems of equations? The study of infinite systems of linear 
equations over R was indeed initiated by Hill and formalized by Poincaré but has 
since been subsumed into functional analysis and will not be considered here. It is 
known that every finite subsystem of an infinite system of linear equations over an 
arbitrary field F has a solution over F if and only if the infinite system has a solution 
over F. 


With kind permission of the American Mathematical Soci- 
ety (Hill); With kind permission of the AIP Emilio Segre Vi- 
sual Archives, Physics Today Collection and Tenn Collection 
(Poincaré). 

George William Hill was a nineteenth-century 
American mathematical astronomer. French 
mathematician Jules Henri Poincaré was one of 
the foremost mathematical geniuses of the late 
nineteenth century. 


Example Let $a < b$ be real numbers and let $V = C(a, b)$. If $W$ is a subspace of
$V$ of dimension $n$ then the interpolation problem of $V$ is the following: given a
function $f \in V$ and given real numbers $a < t_1 < \cdots < t_n < b$, find a function $g \in W$
satisfying $f(t_j) = g(t_j)$ for $1 \le j \le n$. If we are given a basis $\{g_1, \ldots, g_n\}$ of $W$


J.S. Golan, The Linear Algebra a Beginning Graduate Student Ought to Know, 189 
DOI 10.1007/978-94-007-2636-9_10, © Springer Science+Business Media B.V. 2012 
then we want to find real numbers $c_1, \ldots, c_n$ satisfying $\sum_{i=1}^n c_i g_i(t_j) = f(t_j)$ for
all $1 \le j \le n$. In other words, we want to solve a system of linear equations of the
above form, where $k = n$, $a_{ij} = g_j(t_i)$ and $b_i = f(t_i)$ for all $1 \le i, j \le n$.


Example In Proposition 4.2, we noted that if $F$ is a field and if $f(X)$ and $g(X) \neq 0$
are elements of $F[X]$, then there exist unique polynomials $u(X)$ and $v(X)$ in $F[X]$
satisfying $f(X) = g(X)u(X) + v(X)$ and $\deg(v) < \deg(g)$. If we set

\[ g(X) = \sum_{i=0}^{k} a_i X^i \quad\text{and}\quad f(X) = \sum_{i=0}^{n} b_i X^i, \]

then the coefficients of $u(X) = \sum_{i=0}^{n-k} c_i X^i$ are found by solving the system of linear
equations

\[ \begin{aligned} a_k Y_0 + a_{k-1} Y_1 + \cdots + a_0 Y_k &= b_k, \\ a_k Y_1 + a_{k-1} Y_2 + \cdots + a_0 Y_{k+1} &= b_{k+1}, \\ &\;\;\vdots \\ a_k Y_{n-k-1} + a_{k-1} Y_{n-k} &= b_{n-1}, \\ a_k Y_{n-k} &= b_n \end{aligned} \]

by any of the methods we will discuss.
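Because the last equation of this system involves only $Y_{n-k}$, the one before it only $Y_{n-k-1}$ and $Y_{n-k}$, and so on, the system can be solved from the bottom up. The sketch below illustrates this under the assumptions $a_k \neq 0$ and $n \ge k$, with rational coefficients; the function name `quotient_coeffs` and the list-based representation of polynomials (lowest degree first) are illustrative choices, not notation from the text.

```python
from fractions import Fraction

def quotient_coeffs(a, b):
    """Coefficients [c_0, ..., c_{n-k}] of the quotient u(X) of f by g.

    a = [a_0, ..., a_k] are the coefficients of g (with a_k != 0) and
    b = [b_0, ..., b_n] those of f (with n >= k), lowest degree first.
    """
    k, n = len(a) - 1, len(b) - 1
    c = [Fraction(0)] * (n - k + 1)
    # The equation for the coefficient of X^m reads
    #   a_k Y_{m-k} + a_{k-1} Y_{m-k+1} + ... = b_m,
    # so working from m = n down to m = k determines Y_{m-k} in turn.
    for m in range(n, k - 1, -1):
        s = Fraction(b[m])
        for j in range(k):                 # terms a_j * Y_{m-j} already known
            if m - j <= n - k:
                s -= a[j] * c[m - j]
        c[m - k] = s / a[k]
    return c

# f(X) = X^3 + 2X^2 + 3X + 4 and g(X) = X + 1: the quotient is X^2 + X + 2.
u = quotient_coeffs([1, 1], [4, 3, 2, 1])
```

Exact rational arithmetic is used here so that the computed coefficients can be compared with a hand calculation without rounding concerns.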


Example Sometimes we can transform systems of nonlinear equations into systems
of linear equations. For example, suppose that we want to find positive real numbers
$r_1$, $r_2$, and $r_3$ satisfying the following nonlinear system of equations:

\[ \begin{aligned} r_1 r_2 r_3 &= 1, \\ r_1^3 r_2^2 r_3^2 &= 27, \\ r_3 / r_1 r_2 &= 81. \end{aligned} \]

Since each of the integers on the right is a power of 3, we can take the logarithm to
the base 3 of both sides of each equation. Setting $X_i = \log_3(r_i)$ for $1 \le i \le 3$, the
system now becomes linear

\[ \begin{aligned} X_1 + X_2 + X_3 &= 0, \\ 3X_1 + 2X_2 + 2X_3 &= 3, \\ -X_1 - X_2 + X_3 &= 4, \end{aligned} \]

and this has a unique solution (which we can find by methods to be discussed in
this chapter) $X_1 = 3$, $X_2 = -5$, and $X_3 = 2$, showing that the original system has a
solution $r_1 = 27$, $r_2 = 1/243$, and $r_3 = 9$.
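The stated solution can be checked by direct substitution. The short sketch below does this in exact arithmetic; the variable names are illustrative, and the second nonlinear equation is read off from the second linear equation by exponentiating both sides to the base 3.

```python
from fractions import Fraction

X = [3, -5, 2]                                   # claimed solution of the linear system
rows = [([1, 1, 1], 0), ([3, 2, 2], 3), ([-1, -1, 1], 4)]
linear_ok = all(
    sum(a * x for a, x in zip(coeffs, X)) == rhs for coeffs, rhs in rows
)

# r_i = 3 ** X_i, computed exactly as rational numbers
r1, r2, r3 = (Fraction(3) ** x for x in X)
nonlinear_ok = (
    r1 * r2 * r3 == 1
    and r1 ** 3 * r2 ** 2 * r3 ** 2 == 27
    and r3 / (r1 * r2) == 81
)
```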


A system of linear equations of the above form is homogeneous if and only if
$b_i = 0$ for all $1 \le i \le k$; otherwise it is nonhomogeneous. At this stage, we do not
yet know answers to the following questions:
(1) Does a given system of linear equations have a solution?

(2) If it has a solution, is that solution unique? 

(3) If the solution is not unique, can we characterize the set of all solutions? 

(4) If there are solutions, how do we compute them efficiently? 

In order to answer these questions, we have to move to the language of matrices. The 
use of matrices for this purpose was developed in Europe in the nineteenth century 
by Cayley, Sylvester, and Laguerre. However, the real pioneers were the Chinese 
and Japanese mathematicians. During the time of the Han dynasty in China, around 
2000 years ago, the Nine Chapters on the Mathematical Art (Jiuzhang Suanshu) 
presented a method for solving systems of linear equations using matrices. A major 
commentary on this was subsequently written by Liu Hui. This, in turn, formed the 
basis for the later work of Seki. 


Edmond Laguerre, a nineteenth-century French mathematician, wrote an important
book on systems of linear equations in 1867. Liu Hui lived in the third century in the
Kingdom of Wei in north-central China. He added proofs and computational algorithms
using counting rods. Takakazu Seki Kowa was a seventeenth-century Japanese
mathematician, the son of a samurai warrior family, who developed matrix-based
methods based on Chinese texts.


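To see how this is done, let us write the above system in the form

\[ \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{k1} & \cdots & a_{kn} \end{bmatrix} \begin{bmatrix} X_1 \\ \vdots \\ X_n \end{bmatrix} = \begin{bmatrix} b_1 \\ \vdots \\ b_k \end{bmatrix}. \]

The matrix $A = [a_{ij}] \in M_{k\times n}(F)$ is the coefficient matrix of the system. If we set
$w = \begin{bmatrix} b_1 \\ \vdots \\ b_k \end{bmatrix} \in F^k$, then the matrix $[A\ w] \in M_{k\times(n+1)}(F)$ is called the extended
coefficient matrix of the system. The set of all vectors $v = \begin{bmatrix} d_1 \\ \vdots \\ d_n \end{bmatrix} \in F^n$ satisfying
$Av = w$ is the solution set of the system. This is clearly equal to $\alpha^{-1}(w)$, where
$\alpha : F^n \to F^k$ is the linear transformation satisfying $\Phi_{BD}(\alpha) = A$, where $B$ and $D$
are the canonical bases of $F^n$ and $F^k$, respectively. In particular, if the system is
homogeneous then its solution set is just the kernel of $\alpha$, and is called the solution
space of the system.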
We note the following simple but important point: if $F$ is a subfield of a field $K$
and if $k$ and $n$ are positive integers, then any matrix $A$ in $M_{k\times n}(F)$ also belongs
to $M_{k\times n}(K)$ and any vector $v \in F^n$ also belongs to $K^n$. Therefore, if $w \in F^k$, any
element of the solution set of $Av = w$, considered as a system of linear equations
over $F$, remains a solution when we consider this as a system of linear equations
over $K$.


Proposition 10.1 The solution set of a homogeneous system of linear equations
in $n$ unknowns is a subspace of $F^n$.


Proof This is a direct consequence of Proposition 6.4. 


For nonhomogeneous systems, the situation is a bit more complicated. 


Proposition 10.2 Let $AX = w$ be a nonhomogeneous system of linear equations
in $n$ unknowns over a field $F$ and let $v_0 \in F^n$ be a solution to this system.
Then the solution set of the system is the set of all vectors in $F^n$ of the
form $v_0 + v$, where $v$ is a solution to the homogeneous system $AX = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$.


Proof This is an immediate consequence of Proposition 6.6. 


We should emphasize that the solution set of a nonhomogeneous system of linear
equations is not a subspace of $F^n$ but rather an affine subset of that space.


Example If we identify $\mathbb{R}^2$ with the Euclidean plane by associating each vector $\begin{bmatrix} a \\ b \end{bmatrix}$
with the point with coordinates $(a, b)$, then we see its subspaces of dimension 1 are
precisely the straight lines going through the origin. The solutions of linear equations
of the form $a_1X_1 + a_2X_2 = b$, where $b \neq 0$ and at least one of the $a_i$ is also
nonzero, are the straight lines in the plane which do not go through the origin.


We are still left with the question of how to actually find a solution to a system 
of linear equations. Here we can distinguish between two approaches: 

Direct Methods These methods involve the manipulation of the matrix A, either 
replacing it with another matrix which is easier to work with or factoring it into a 
product of matrices which are easier to work with, and thus reducing the difficulty 
of the problem. 

Iterative Methods These methods involve selecting a likely solution for the system 
and then repeatedly modifying it to obtain a sequence of vectors which (hopefully) 
will converge to an actual solution to the system. Such methods work, of course,
only if our vector space is one in which the notion of convergence is meaningfully 
defined. As we shall see, this is possible when the field of scalars is R or C. 

We begin by looking at direct methods. Let $P$ be a nonsingular matrix in
$M_{k\times k}(F)$. A vector $v \in F^n$ is a solution to the system $AX = w$ over $F$ if and
only if it is a solution to the system $(PA)X = Pw$. In particular, this is true for
elementary matrices. Thus, given a system of linear equations, we can change the
order of the equations, multiply one of the equations by a nonzero scalar, or add a
scalar multiple of one equation to another, without changing the solution set of the
system, so long as we do the same thing on both sides of the equal sign. In order to
do this efficiently, it is best to work with the extended coefficient matrix $[A\ w]$ and
perform elementary operations on it to reduce it to a convenient form.

Let $F$ be a field, let $k$ and $n$ be positive integers, and let $B = [b_{ij}] \in M_{k\times n}(F)$.
The matrix $B$ is in row echelon form if and only if for each $1 \le i \le k$ there exists an
integer $1 \le s(i) \le n + 1$ such that
(1) $b_{ij} = 0$ for all $1 \le j < s(i)$ but $b_{i,s(i)} \neq 0$ if $s(i) \le n$; and


(2) $s(1) < s(2) < \cdots < s(k)$.
| and 
0 
4 
1 
7 


6 7 7 1 8 0 0 0 0 
i 9 2 1 1 000 2 6 ; 
Example The matrices 0022 000 0 0| Hf in row 
000 1 000 0 0 


1 
0 
0 
0 
1 
echelon form. The matrix f is not in row echelon form. 
0 


Example If $n$ is a positive integer and if $B \in M_{n\times n}(F)$ is in row echelon form,
then $B$ is surely upper triangular. However, $\begin{bmatrix} 1 & 0 & 2 & 7 \\ 0 & 0 & 3 & 8 \\ 0 & 0 & 9 & 0 \\ 0 & 0 & 0 & 5 \end{bmatrix}$ is an upper-triangular
matrix which is not in row echelon form.


We claim that any matrix $A = [a_{ij}] \in M_{k\times n}(F)$ is row equivalent to a matrix
in row echelon form. By Proposition 9.4, this is equivalent to saying that $A$ can be
transformed into a matrix in row echelon form by a series of elementary operations,
as follows:

(1) Find the leftmost column of $A$ which has a nonzero entry, say column $h$, and
interchange rows if necessary, so that this entry is in the first row. Thus we now have
a matrix $A$ in which $a_{1h} \neq 0$ and $a_{ij} = 0$ for all $1 \le i \le k$ and all $1 \le j < h$.

(2) For each $1 < i \le k$, if $a_{ih} \neq 0$ then we multiply the first row by $-a_{ih}a_{1h}^{-1}$ and
add it to the $i$th row, which creates a new row in which the $(i, h)$-entry is equal
to 0. Thus, we now have a matrix in which $a_{ih} = 0$ for all $1 < i \le k$.

(3) Now consider the submatrix of $A$ from which we deleted the first row and the
first $h$ columns, and repeat the above procedure.
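The three steps above can be sketched in code as follows. This is only an illustration over the rational numbers, using exact `Fraction` arithmetic; the function name `row_echelon` and the representation of a matrix as a list of rows are assumptions made for the example. Instead of literally deleting rows and columns, the sketch keeps an index `top` marking the first row of the current submatrix.

```python
from fractions import Fraction

def row_echelon(rows):
    """Return a row echelon form of the given matrix (a list of rows)."""
    A = [[Fraction(x) for x in row] for row in rows]
    k = len(A)
    n = len(A[0]) if A else 0
    top = 0                                    # first row of the current submatrix
    for h in range(n):                         # scan the columns from the left
        pivot = next((i for i in range(top, k) if A[i][h] != 0), None)
        if pivot is None:
            continue                           # no nonzero entry in this column
        A[top], A[pivot] = A[pivot], A[top]    # step (1): interchange rows
        for i in range(top + 1, k):            # step (2): clear the entries below
            if A[i][h] != 0:
                factor = -A[i][h] / A[top][h]
                A[i] = [x + factor * y for x, y in zip(A[i], A[top])]
        top += 1                               # step (3): pass to the submatrix
        if top == k:
            break
    return A

ref = row_echelon([[1, 2, 3, 1], [2, 1, 4, 2], [1, -1, 1, 1]])
# ref has rows [1, 2, 3, 1], [0, -3, -2, 0], [0, 0, 0, 0]
```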


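Example Let us begin with the matrix $A = \begin{bmatrix} 1 & 2 & 3 & 1 \\ 2 & 1 & 4 & 2 \\ 1 & -1 & 1 & 1 \end{bmatrix} \in M_{3\times 4}(\mathbb{R})$. We already
have $a_{11} \neq 0$. Multiplying the first row by $-2$ and adding it to the second
row, we obtain $\begin{bmatrix} 1 & 2 & 3 & 1 \\ 0 & -3 & -2 & 0 \\ 1 & -1 & 1 & 1 \end{bmatrix}$ and then multiplying the first row by $-1$ and
adding it to the third row, we obtain $\begin{bmatrix} 1 & 2 & 3 & 1 \\ 0 & -3 & -2 & 0 \\ 0 & -3 & -2 & 0 \end{bmatrix}$. We also already have
$a_{22} \neq 0$. Multiplying the second row by $-1$ and adding it to the third row, we obtain
$\begin{bmatrix} 1 & 2 & 3 & 1 \\ 0 & -3 & -2 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$, and this is in row echelon form.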


If $A = [a_{ij}] \in M_{k\times n}(F)$ is a matrix in row echelon form, and if the $h$th row of $A$
contains nonzero entries, then the leftmost nonzero entry of the row is the leading
entry. The matrix $A$ is in reduced row echelon form if it is in row echelon form and,
in addition, satisfies the following conditions:

(1) The leading entry in each nonzero row is equal to 1;

(2) If $a_{hj}$ is a leading entry, then $a_{ij} = 0$ for all $i \neq h$.

Any matrix in row echelon form is row-equivalent to one in reduced row echelon form;
that is to say, such a matrix can be converted to one in reduced row echelon form by
performing additional elementary operations: first, we multiply each nonzero row
by the multiplicative inverse of its leading entry, to obtain a matrix in which the
leading entry of each nonzero row equals 1. Then, if $a_{hj}$ is a leading entry and if
$i < h$, we multiply the $h$th row by $-a_{ij}$ and add it to the $i$th row, which will give
us a matrix with the $(i, j)$-entry equal to 0. The reduced row echelon form of any
given matrix is clearly unique.


Example Let us go back and look at the matrix $\begin{bmatrix} 1 & 2 & 3 & 1 \\ 0 & -3 & -2 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$ in row echelon
form. The leading entry of the first row is already equal to 1. Multiplying the second
row by $-\frac{1}{3}$, we obtain $\begin{bmatrix} 1 & 2 & 3 & 1 \\ 0 & 1 & \frac{2}{3} & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$, a matrix in which the leading entry of the
second row is equal to 1 as well. Now multiply the second row by $-2$ and add it to
the first row, to obtain $\begin{bmatrix} 1 & 0 & \frac{5}{3} & 1 \\ 0 & 1 & \frac{2}{3} & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$, which is in reduced row echelon form.
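The passage to reduced row echelon form can also be sketched in code. The version below is a common one-pass variant rather than the two-stage procedure described in the text: each leading entry is scaled to 1 as soon as it is found, and the rest of its column is cleared both above and below. The function name `reduced_row_echelon` is an illustrative choice, and the arithmetic is over the rationals.

```python
from fractions import Fraction

def reduced_row_echelon(rows):
    """Return the reduced row echelon form of the given matrix (a list of rows)."""
    A = [[Fraction(x) for x in row] for row in rows]
    k = len(A)
    n = len(A[0]) if A else 0
    top = 0
    for h in range(n):
        pivot = next((i for i in range(top, k) if A[i][h] != 0), None)
        if pivot is None:
            continue
        A[top], A[pivot] = A[pivot], A[top]
        lead = A[top][h]
        A[top] = [x / lead for x in A[top]]        # leading entry becomes 1
        for i in range(k):                         # clear the rest of column h
            if i != top and A[i][h] != 0:
                factor = -A[i][h]
                A[i] = [x + factor * y for x, y in zip(A[i], A[top])]
        top += 1
        if top == k:
            break
    return A

rref = reduced_row_echelon([[1, 2, 3, 1], [0, -3, -2, 0], [0, 0, 0, 0]])
# rref has rows [1, 0, 5/3, 1], [0, 1, 2/3, 0], [0, 0, 0, 0]
```

Since the reduced row echelon form is unique, applying the same function to the matrix from which this row echelon form was obtained gives the same result.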




Example Even this very simple algorithm can lead to computational problems. Let
$n$ be a positive integer and let $A = [a_{ij}] \in M_{n\times n}(\mathbb{R})$ be the matrix defined as follows:

\[ a_{ij} = \begin{cases} 1 & \text{if } i = j \text{ or } j = n, \\ -1 & \text{if } i > j, \\ 0 & \text{otherwise.} \end{cases} \]

If we use the above method to reduce $A$ to row echelon form we obtain a
matrix $B = [b_{ij}]$ where

\[ b_{ij} = \begin{cases} 1 & \text{if } i = j < n, \\ 2^{i-1} & \text{for } j = n, \\ 0 & \text{otherwise.} \end{cases} \]

If $n$ is sufficiently large, the element $b_{nn}$ may be considerably corrupted due to
roundoff and truncation error.
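The growth of the last column is easy to observe experimentally. The sketch below builds the matrix of this example and carries out the elimination in exact integer arithmetic; the function name `eliminate` is an illustrative choice. It does not reproduce the roundoff itself, but the final line recalls why such growth is dangerous in floating point: a double-precision number carries 53 significant bits, so a contribution of size 1 is lost next to an entry of size $2^{53}$.

```python
def eliminate(n):
    """Row echelon form of the n x n matrix of the example (0-based indices)."""
    A = [[1 if (i == j or j == n - 1) else (-1 if i > j else 0) for j in range(n)]
         for i in range(n)]
    for h in range(n - 1):
        for i in range(h + 1, n):
            factor = -A[i][h] // A[h][h]       # the pivots equal 1, so this is exact
            A[i] = [x + factor * y for x, y in zip(A[i], A[h])]
    return A

last_column = [row[-1] for row in eliminate(6)]    # entries double from row to row

# In double precision, adding 1 to 2**53 changes nothing.
lost = float(2 ** 53) + 1.0 == float(2 ** 53)
```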


Reduction of a matrix in $M_{k\times n}(F)$ to reduced row-echelon form depends
strongly on the fact that every nonzero element in a field has a multiplicative
inverse. If we are considering matrices in $M_{k\times n}(K)$, where $K$ is the unital commutative
associative algebra of polynomials in one or several variables over $F$, this no
longer holds. In such situations, however, it is possible to reduce a matrix to a form
known as Howell Canonical Form, which is equivalent to row-echelon form with
leading entries equal to 1 in the case we are working over a field. This is important
for computations since, as we will see, algebras of the form $M_{k\times n}(F[X])$ have an
important part to play in the theory we are developing.

Now let us return to the system of linear equations AX = w in n unknowns
and consider methods of solution. The most well-known is Gaussian elimination or
the Gauss-Jordan method. In this method, we first perform elementary operations
on the extended coefficient matrix $[A \;\; w]$ to bring it to reduced row echelon form.
Having done this, we now have a new system of linear equations $A'X = w'$, the
solution set of which is the same as that of the original system. Write $a_{ij}$ for the
entries of $A'$ and $b_i$ for the entries of $w'$. Let t be the greatest integer i such that
the ith row of the reduced extended matrix has nonzero entries. There are several possibilities:

(1) $b_t \neq 0$ but $a_{tj} = 0$ for all $1 \le j \le n$. Then the system has no solutions, and we
are done.

(2) There is precisely one index j such that $a_{tj} \neq 0$. Then this must in fact be the
leading entry of the tth row and so $a_{tj} = 1$. This means that in any element of
the solution set of the system we must have the jth entry equal to $b_t$. We can
therefore substitute $b_t$ for $X_j$ in each of the other equations, and reduce the
system to one in n - 1 unknowns.

(3) There are several indices j such that $a_{tj} \neq 0$, say those in columns
$h_1 < h_2 < \cdots < h_m$. Then $a_{th_1}$ is the leading entry of the tth row and so equals 1.
Moreover, for any values $z_2, \dots, z_m$ we substitute for $X_{h_2}, \dots, X_{h_m}$, we will get
a solution to the tth equation with these values and with $b_t - \sum_{s=2}^{m} a_{th_s} z_s$
substituted for $X_{h_1}$. Thus we can consider the $z_s$ as parameters of a general solution
and again reduce the system to one in a smaller number of unknowns.

(4) Having reduced the system, we now recursively apply the previous steps until
the system is solved.
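The reduction step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's implementation: the function name `rref` is an assumption, exact rational arithmetic (`fractions.Fraction`) is used so that roundoff plays no role, and the matrix used is the extended coefficient matrix of the system $3X_1 + 2X_2 + X_3 = 0$, $-2X_1 + X_2 - X_3 = 2$, $2X_1 - X_2 + 2X_3 = -1$.

```python
from fractions import Fraction


def rref(rows):
    """Return the reduced row echelon form of a matrix given as a list of rows."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(n_cols):
        # Look for a row at or below pivot_row with a nonzero entry in this column.
        src = next((r for r in range(pivot_row, n_rows) if m[r][col] != 0), None)
        if src is None:
            continue
        m[pivot_row], m[src] = m[src], m[pivot_row]
        # Scale the pivot row so that its leading entry is 1.
        lead = m[pivot_row][col]
        m[pivot_row] = [x / lead for x in m[pivot_row]]
        # Clear the pivot column in every other row.
        for r in range(n_rows):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return m


extended = [[3, 2, 1, 0], [-2, 1, -1, 2], [2, -1, 2, -1]]
reduced = rref(extended)
# The last column of `reduced` holds the unique solution of this system.
```

Reading off the last column of `reduced` corresponds to case (2) applied to each row in turn.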



Carl Friedrich Gauss, who lived in Germany at 
the beginning of the nineteenth century, is con- 
sidered to be the leading mathematician of all 
times, as well as a physicist and astronomer of the 
first rank. He developed this method in connec- 
tion with his work in astronomy in 1809. Gaussian elimination first appeared in print in 
a handbook by German geodesist Wilhelm Jordan, who applied the method to problems 
in surveying. The first computer program to solve a system of linear equations by Gaus- 
sian elimination was written by Lady Augusta Ada Lovelace, a student of De Morgan and 
daughter of the poet Lord Byron, who developed software for Charles Babbage’s (never 
completed) mechanical computer in the nineteenth century. Her program was capable of 
solving systems of 10 linear equations in 10 unknowns. 


Strassen’s insight that Gaussian elimination may not be the optimal method of 
solving systems of linear equations, as had been previously thought, led to the de- 
velopment of his method of matrix multiplication. 


Example Let us consider the system of linear equations 


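$$\begin{aligned} 3X_1 + 2X_2 + X_3 &= 0, \\ -2X_1 + X_2 - X_3 &= 2, \\ 2X_1 - X_2 + 2X_3 &= -1 \end{aligned}$$

over the field $\mathbb{R}$. The extended coefficient matrix of this system is
$\begin{bmatrix} 3 & 2 & 1 & 0 \\ -2 & 1 & -1 & 2 \\ 2 & -1 & 2 & -1 \end{bmatrix}$
and this is row equivalent to the matrix
$\begin{bmatrix} 1 & \frac{2}{3} & \frac{1}{3} & 0 \\ 0 & 1 & -\frac{1}{7} & \frac{6}{7} \\ 0 & 0 & 1 & 1 \end{bmatrix}$
in row echelon form, which is in turn row equivalent to the matrix
$\begin{bmatrix} 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \end{bmatrix}$
in reduced row echelon form. Thus we see that the solution set of the system is
$\left\{ \begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix} \right\}$.


Example Let us consider the system of linear equations

$$\begin{aligned} X_1 + X_2 &= 1, \\ X_1 - X_2 &= 3, \\ -X_1 + 2X_2 &= -2 \end{aligned}$$

over the field $\mathbb{R}$. The extended coefficient matrix of this system equals
$\begin{bmatrix} 1 & 1 & 1 \\ 1 & -1 & 3 \\ -1 & 2 & -2 \end{bmatrix}$
and this is row equivalent to the matrix
$\begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{bmatrix}$
in row echelon form, which is row equivalent to the matrix
$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
in reduced row echelon form. Therefore, this system has no solutions at all.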


Example Let us consider the system of linear equations 


$$\begin{aligned} X_1 + 2X_2 + X_3 &= -1, \\ 2X_1 + 4X_2 + 3X_3 &= 3, \\ 3X_1 + 6X_2 + 4X_3 &= 2 \end{aligned}$$

over $\mathbb{R}$. The extended coefficient matrix of this system is
$\begin{bmatrix} 1 & 2 & 1 & -1 \\ 2 & 4 & 3 & 3 \\ 3 & 6 & 4 & 2 \end{bmatrix}$
and this is row equivalent to
$\begin{bmatrix} 1 & 2 & 1 & -1 \\ 0 & 0 & 1 & 5 \\ 0 & 0 & 0 & 0 \end{bmatrix}$
in row echelon form, which is in turn row equivalent to
$\begin{bmatrix} 1 & 2 & 0 & -6 \\ 0 & 0 & 1 & 5 \\ 0 & 0 & 0 & 0 \end{bmatrix}$
in reduced row echelon form. From the second row, we see that we must have $X_3 = 5$.
From the first row, we have $X_1 + 2X_2 = -6$ and so, for each value $X_2 = z$, we have a
solution with $X_1 = -6 - 2z$. Therefore, the solution set to our system is
$\left\{ \begin{bmatrix} -6 - 2z \\ z \\ 5 \end{bmatrix} \;\middle|\; z \in \mathbb{R} \right\}$.

Gaussian elimination can also be used to check if a set of vectors in $F^k$ is linearly
independent. Let $\{v_1, \dots, v_n\}$ be a set of vectors in $F^k$, where
$v_j = \begin{bmatrix} a_{1j} \\ \vdots \\ a_{kj} \end{bmatrix}$ for all j.
We want to know if there are scalars $b_1, \dots, b_n$ in F, not all equal to 0, satisfying
$\sum_{j=1}^{n} b_j v_j = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$. That is, we want to know
if the homogeneous system of linear equations $AX = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$
has a nonzero solution, where $A = [a_{ij}] \in M_{k\times n}(F)$.


Example Let us check if a three-element subset $\{v_1, v_2, v_3\}$ of a space of column
vectors over $\mathbb{Q}$ is linearly dependent [the entries of the vectors are illegible
in the source]. To do so we need to consider the matrix A whose columns are
$v_1, v_2, v_3$. This matrix is row equivalent to a matrix in reduced row echelon form
from which we see that the set of solutions to the homogeneous system AX = 0 is
$\left\{ \begin{bmatrix} -2z \\ z \\ z \end{bmatrix} \;\middle|\; z \in \mathbb{Q} \right\}$,
so that if we pick one such nonzero element, say $\begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix}$,
we see that $(-2)v_1 + v_2 + v_3 = 0$, showing that the set is indeed
linearly dependent.


We note that if $A \in M_{k\times n}(F)$ then the number of arithmetic operations needed
to solve a system of linear equations of the form AX = w using Gaussian elimination
is no more than $\tfrac{1}{6}k(k-1)(3n-k-2)$ if $k < n$ and no more than
$\tfrac{1}{6}n[3kn + 3(k-n) - n^2 - 2]$ otherwise. Of course, if the matrix A is of a
special form, this procedure can be much faster. For example, if $A \in M_{n\times n}(F)$ is a
tridiagonal matrix, then a system of equations of the form AX = w can be solved
using 3n additions/subtractions and 5n multiplications.
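To see why the tridiagonal case is so cheap, here is a minimal sketch of elimination specialised to a tridiagonal matrix (often called the Thomas algorithm). The function name and the three-list storage layout are assumptions made for illustration, no pivoting is done (so a zero pivot would fail), and floating-point arithmetic is used. Each row costs roughly three additions/subtractions and five multiplications/divisions, in line with the counts quoted above.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system by elimination without pivoting.

    lower[i] is the entry at (i, i-1) (lower[0] is unused),
    diag[i]  is the entry at (i, i),
    upper[i] is the entry at (i, i+1) (upper[-1] is unused).
    """
    n = len(diag)
    c = [0.0] * n  # modified superdiagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0] if n > 1 else 0.0
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# Illustrative data: the matrix with 2 on the diagonal and -1 beside it.
solution = solve_tridiagonal([0, -1, -1], [2, 2, 2], [-1, -1, 0], [0, 0, 4])
```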

If $A \in M_{n\times n}(F)$ is a nonsingular matrix which can be written in the form LU,
where L is lower triangular and U is upper triangular, then the system AX = w can be
solved in two easy steps. Since L and U must also be nonsingular, the system LY = w
has a unique solution $y = L^{-1}w$, which is easy to find by forward substitution because
L is lower triangular. The system UX = y is then easy to solve using Gaussian
elimination, since U is already in row echelon form, and its unique solution
$U^{-1}y = U^{-1}L^{-1}w = A^{-1}w$ is the unique solution of AX = w. We therefore see the
importance of the LU-decomposition of matrices, assuming that one exists.
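As a hedged illustration of how an LU-decomposition is used, the sketch below factors a matrix as A = LU (Doolittle form, with 1s on the diagonal of L and no row exchanges, which is an assumption that can fail when a zero pivot appears) and then solves AX = w by solving LY = w with forward substitution followed by UX = y with back substitution. All function names are illustrative.

```python
from fractions import Fraction


def lu_decompose(a):
    """Doolittle factorisation A = LU without row exchanges.

    Raises ZeroDivisionError if a zero pivot is met.
    """
    n = len(a)
    u = [[Fraction(x) for x in row] for row in a]
    l = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for col in range(n):
        for row in range(col + 1, n):
            factor = u[row][col] / u[col][col]
            l[row][col] = factor
            u[row] = [x - factor * y for x, y in zip(u[row], u[col])]
    return l, u


def forward_substitute(l, w):
    """Solve LY = w for lower-triangular L."""
    y = []
    for i, row in enumerate(l):
        y.append((w[i] - sum(row[j] * y[j] for j in range(i))) / row[i])
    return y


def back_substitute(u, y):
    """Solve UX = y for upper-triangular U."""
    n = len(u)
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(u[i][j] * x[j] for j in range(i + 1, n))) / u[i][i]
    return x


a = [[3, 2, 1], [-2, 1, -1], [2, -1, 2]]
w = [0, 2, -1]
l, u = lu_decompose(a)
x = back_substitute(u, forward_substitute(l, w))
```

Once L and U are known, each further right-hand side w costs only the two substitutions.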

Given a matrix $A \in M_{k\times n}(F)$, we define the column space of A to be the
subspace of $F^k$ generated by the set of all columns of A. The dimension of the column
space of A is called the rank of A. Moreover, there exists a linear transformation
$\alpha: F^n \to F^k$ satisfying the condition that $\Phi_{BD}(\alpha) = A$, where B and D are the
canonical bases of $F^n$ and $F^k$, respectively, and it is clear that the column space
of A is just $\operatorname{im}(\alpha)$. Similarly, we define the row space of A to be the subspace of
$M_{1\times n}(F)$ generated by the rows of A. We will show that the dimension of this
space is also equal to the rank of A.


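Proposition 10.3 Let F be a field, let k and n be positive integers, and let
$A \in M_{k\times n}(F)$ and let $w = \begin{bmatrix} b_1 \\ \vdots \\ b_k \end{bmatrix} \in F^k$.
Then the system of linear equations AX = w has a solution if and only if w belongs
to the column space of A.


Proof If $v = \begin{bmatrix} d_1 \\ \vdots \\ d_n \end{bmatrix}$ is a solution of the system AX = w then

$$w = Av = \sum_{j=1}^{n} d_j \begin{bmatrix} a_{1j} \\ \vdots \\ a_{kj} \end{bmatrix}$$

and so w is a linear combination of the columns of A. Conversely, if we assume that
there exist scalars $d_1, \dots, d_n$ in F such that
$w = \sum_{j=1}^{n} d_j \begin{bmatrix} a_{1j} \\ \vdots \\ a_{kj} \end{bmatrix}$, then
$v = \begin{bmatrix} d_1 \\ \vdots \\ d_n \end{bmatrix}$ is a solution of the given system.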


In particular, we get the following consequence of this result.

Proposition 10.4 Let F be a field, let k and n be positive integers, and let
$A \in M_{k\times n}(F)$ and let $w = \begin{bmatrix} b_1 \\ \vdots \\ b_k \end{bmatrix} \in F^k$.
Then the system of linear equations AX = w has a solution if and only if the rank
of the coefficient matrix A is equal to the rank of the extended coefficient matrix.
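The rank criterion is easy to try out numerically. The following sketch (function names are assumptions, and exact rational arithmetic is used) computes ranks by elimination and compares the rank of A with that of the extended matrix, using the systems $X_1 + X_2 = 1$, $X_1 - X_2 = 3$, $-X_1 + 2X_2 = -2$ (no solutions) and $X_1 + 2X_2 + X_3 = -1$, $2X_1 + 4X_2 + 3X_3 = 3$, $3X_1 + 6X_2 + 4X_3 = 2$ (infinitely many solutions).

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix, computed by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # number of pivots found so far
    for col in range(len(m[0])):
        src = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if src is None:
            continue
        m[r], m[src] = m[src], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def is_consistent(a, w):
    """AX = w has a solution iff rank(A) equals the rank of the extended matrix."""
    extended = [list(row) + [b] for row, b in zip(a, w)]
    return rank(a) == rank(extended)


no_solution = is_consistent([[1, 1], [1, -1], [-1, 2]], [1, 3, -2])
has_solution = is_consistent([[1, 2, 1], [2, 4, 3], [3, 6, 4]], [-1, 3, 2])
```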


Now let us return to the problem of identifying the solution sets of homogeneous 
systems of linear equations. 


Proposition 10.5 Let F be a field, let k and n be positive integers, and let
$A \in M_{k\times n}(F)$ be a matrix the columns of which are vectors $y_1, \dots, y_n$ in $F^k$.
Assume these columns are arranged such that $\{y_1, \dots, y_r\}$ is a basis for the
column space of A, for some $r \le n$. Moreover, for all $r < h \le n$, let us select
scalars $b_{h1}, \dots, b_{hn}$ such that:
(1) $y_h = b_{h1}y_1 + \cdots + b_{hr}y_r$;
(2) $b_{hh} = -1$;
(3) $b_{hj} = 0$ otherwise.
For each $r < h \le n$, let $v_h = \begin{bmatrix} b_{h1} \\ \vdots \\ b_{hn} \end{bmatrix} \in F^n$.
Then $\{v_{r+1}, \dots, v_n\}$ is a basis for the solution space of the homogeneous system
of linear equations $AX = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$.


(Comment before the proof: Since $\{y_1, \dots, y_n\}$ is a set of generators for the
column space of A, it contains a subset that is a basis. The assumption that this is
$\{y_1, \dots, y_r\}$ is for notational convenience only.)


Proof If r = n then the solution space of the system of linear equations is
$\left\{ \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix} \right\}$
and so the result is immediate. Hence let us assume that r < n. If $r < h \le n$, then
$Av_h = \sum_{j=1}^{r} b_{hj}y_j - y_h = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$,
and so each $v_h$ belongs to the solution space of
$AX = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$. Moreover, the set $\{v_{r+1}, \dots, v_n\}$ is
linearly independent, since if
$\sum_{j=r+1}^{n} c_j v_j = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$
then for each $r < h \le n$ we note that the hth entry on the left-hand side is $-c_h$
whereas the corresponding entry on the right-hand side is 0, proving that $c_h = 0$
for all $r < h \le n$.
We are therefore left to show that $\{v_{r+1}, \dots, v_n\}$ is a generating set for the
solution space of the given homogeneous system. And, indeed, let
$w = \begin{bmatrix} d_1 \\ \vdots \\ d_n \end{bmatrix}$ be a vector in this solution space. Then
$w + \sum_{h=r+1}^{n} d_h v_h = \begin{bmatrix} e_1 \\ \vdots \\ e_n \end{bmatrix}$, where
$e_{r+1} = \cdots = e_n = 0$. This vector also belongs to the solution space of the system,
and so $\sum_{h=1}^{r} e_h y_h = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$. However, since the set
$\{y_1, \dots, y_r\}$ is linearly independent, this implies that $e_1 = \cdots = e_r = 0$ as well.
Therefore, $w = -\sum_{h=r+1}^{n} d_h v_h$, showing that $\{v_{r+1}, \dots, v_n\}$ is a generating
set for the solution space, as required.


As an immediate consequence of Proposition 10.5, we obtain the following re- 
sult. 


Proposition 10.6 Let F be a field, let k and n be positive integers, and let
$A \in M_{k\times n}(F)$. Then the dimension of the solution space of the homogeneous
system of linear equations $AX = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$ is n - r,
where r is the rank of the coefficient matrix A.


We are now ready to prove the characterization of rank which we mentioned 
before. 


Proposition 10.7 Let F be a field, let k and n be positive integers, and let
$A \in M_{k\times n}(F)$. Then the rank of A equals the dimension of the row space
of A.


Proof Let $v_1, \dots, v_k$ be the rows of A, which generate a subspace of $M_{1\times n}(F)$.
We can reorder these rows in such a way that $\{v_1, \dots, v_t\}$ is a basis for the row
space, for some $1 \le t \le k$. This, as we know, does not change the solution space
of the homogeneous system of linear equations
$AX = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$ and hence does not
change the rank $r_A$ of A. Let $B \in M_{t\times n}(F)$ be the matrix obtained from A by
deleting rows $t + 1, \dots, k$. The columns of B belong to $F^t$ and so the rank $r_B$ of B
satisfies $r_B \le t$, which implies that $n - t \le n - r_B$. But we have already seen that
the homogeneous systems of linear equations
$AX = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$ and $BX = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$
have the same solution space and so, by Proposition 10.6, $n - t \le n - r_A$. From this we
conclude that $r_A \le t$. We have thus shown that the rank of any matrix is less than
or equal to the dimension of its row space. In particular, this is also true for $A^T$.
But the rank of $A^T$ is t, while the dimension of its row space is $r_A$, and so we have
$t \le r_A$ as well, proving equality.


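Example Let us find a basis for the solution space of the homogeneous system of linear
equations $A \begin{bmatrix} X_1 \\ X_2 \\ X_3 \\ X_4 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$
over $\mathbb{R}$, where $A \in M_{2\times 4}(\mathbb{R})$ has second row $[1 \;\; 1 \;\; 1 \;\; 1]$
[the first row is illegible in the source]. We know that the coefficient matrix is
row-equivalent to the matrix
$\begin{bmatrix} 1 & 0 & 5 & 1 \\ 0 & 1 & -4 & 0 \end{bmatrix}$ in reduced row echelon form, and
this matrix has rank 2. Therefore, the solution space of the system has dimension
4 - 2 = 2. Indeed, it is easy to check that
$\left\{ \begin{bmatrix} -5 \\ 4 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ 0 \\ 1 \end{bmatrix} \right\}$
is a basis for this solution space.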
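A basis of the solution space of a homogeneous system can be read off mechanically from the reduced row echelon form: each non-pivot column contributes one basis vector. The sketch below is one possible way to do this (the function name is an assumption). It is applied to the matrix with rows (1, 0, 5, 1) and (0, 1, -4, 0), taken here as the reduced form in the example above.

```python
from fractions import Fraction


def null_space_basis(rows):
    """Basis of {x : Ax = 0}, one vector per free (non-pivot) column of A."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0])
    pivots = []  # pivots[i] is the pivot column of row i
    for col in range(n_cols):
        r = len(pivots)
        src = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if src is None:
            continue
        m[r], m[src] = m[src], m[r]
        lead = m[r][col]
        m[r] = [x / lead for x in m[r]]
        for i in range(len(m)):
            if i != r:
                factor = m[i][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(col)
        if len(pivots) == len(m):
            break
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        # Set the free variable to 1 and solve for the pivot variables.
        v = [Fraction(0)] * n_cols
        v[free] = Fraction(1)
        for row_index, pivot_col in enumerate(pivots):
            v[pivot_col] = -m[row_index][free]
        basis.append(v)
    return basis


basis = null_space_basis([[1, 0, 5, 1], [0, 1, -4, 0]])
```

The number of vectors returned is n - r, in agreement with Proposition 10.6.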


Gaussian elimination requires an order of magnitude of $n^3$ arithmetic operations
to solve a system of n linear equations in n unknowns. This computational overhead 
is quite significant if n is large (say, over 10,000), even with the use of supercomput- 
ers. As a result, there is considerable continuing research into finding faster methods 
of computation, especially in those cases in which we have additional information 
on the structure of the matrix of coefficients, originating in knowledge of the par- 
ticular problem from which the system arose. Often this structural information is 
immediately noticeable, but sometimes it appears only after a sophisticated consid- 
eration of the problem. 


Example It is often possible to show that the matrix we are interested in, while not 
itself having a special structure, is equal to the product of two matrices having a spe- 
cial structure, a situation which arises in many mathematical models. Let us consider 
one such case. An n × n symmetric Toeplitz matrix is a matrix $B = [b_{ij}] \in M_{n\times n}(\mathbb{R})$
satisfying the condition that there exist real numbers $c_0, \dots, c_{n-1}$ such that
$b_{ij} = c_h$ whenever $|i - j| = h$. Thus, for example, the matrix
$\begin{bmatrix} 1 & 2 & 0 & 7 \\ 2 & 1 & 2 & 0 \\ 0 & 2 & 1 & 2 \\ 7 & 0 & 2 & 1 \end{bmatrix}$
is a symmetric Toeplitz matrix. Clearly, the set of all symmetric Toeplitz matrices is a
subspace of $M_{n\times n}(\mathbb{R})$. However, it is not a subalgebra, since the product of two
such matrices need not be a symmetric Toeplitz matrix. Symmetric Toeplitz matrices are
also convenient to store in a computer, since we need to keep in memory only the n
scalars $c_0, \dots, c_{n-1}$. Note that symmetric Toeplitz matrices are symmetric with
respect to both main diagonals.

Many mathematical models in economics are built around solving systems of lin- 
ear equations of the form AX = w, where A is a product of two symmetric Toeplitz 
matrices—a fact which emerges from a knowledge of economic theory. 
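The storage remark and the failure of closure under multiplication can both be checked with a small sketch. The helper names below are assumptions, and the scalars used (0, 1, 0 and 1, 2, 0, 7) are purely illustrative.

```python
def symmetric_toeplitz(c):
    """Build the n x n symmetric Toeplitz matrix with entry c[|i - j|] at (i, j)."""
    n = len(c)
    return [[c[abs(i - j)] for j in range(n)] for i in range(n)]


def is_symmetric_toeplitz(m):
    """Check that every entry depends only on |i - j|."""
    n = len(m)
    return all(m[i][j] == m[0][abs(i - j)] for i in range(n) for j in range(n))


def mat_mul(a, b):
    """Plain matrix product of two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


t = symmetric_toeplitz([0, 1, 0])  # only the n scalars c_0, ..., c_{n-1} are stored
product = mat_mul(t, t)            # diagonal entries 1, 2, 1: not constant
```

Here `product` has unequal diagonal entries, so the product of two symmetric Toeplitz matrices is not always of the same kind.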



The proper use of mathematical techniques, and especially computational tech- 
niques, also depends very much on a deep understanding of the particular problem 
one is dealing with. Also, it is crucial to emphasize once again that any method 
we use to solve a system of linear equations on a computer will induce errors as a 
result of roundoff and truncation in our computations. With some methods—such 
as Gaussian elimination—these errors tend to accumulate, whereas with others they 
often cancel each other out, within certain limits. It is therefore necessary, espe- 
cially when we are dealing with large matrices, to have on hand several methods 
of handling such systems of equations and to be able to keep track of the way in 
which errors can propagate in each of the different methods at one’s disposal. The 


matter of the numerical stability of solutions to such systems was investigated by 
Wilkinson, among many others. 




Otto Toeplitz was a twentieth-century German mathematician who 
studied endomorphisms of infinite-dimensional vector spaces. 



The problem of computing solutions of systems of
linear equations was the subject of considerable 
research in the early days of computers. Among 
the contributors were the Russian husband-and- 
wife team of Dimitri Konstantinovich Faddeev 
and Vera Nikolaevna Faddeeva. 


The following is a useful trick which we will need later.

Proposition 10.8 Let F be a subfield of a field K. Let k and n be positive
integers and let $A \in M_{k\times n}(F)$. Suppose that there exists a nonzero vector
$x = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} \in K^n$ satisfying
$Ax = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$. Then there exists a nonzero vector
$y \in F^n$ satisfying $Ay = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$.


Proof Let $V = F\{x_1, \dots, x_n\}$, which is a subspace of K, considered as
a vector space over F. Let $E = \{v_1, \dots, v_p\}$ be a basis for V over F and set
$v = \begin{bmatrix} v_1 \\ \vdots \\ v_p \end{bmatrix} \in K^p$. Then there exists a nonzero matrix
$B \in M_{n\times p}(F)$ satisfying Bv = x and so
$ABv = Ax = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$. But E is linearly independent and so
we must have AB = O. Now take y to be any nonzero column of B.


We now turn to iterative methods of solution of systems of linear equations. For 
simplicity, we will assume that our field of scalars is always R. The basic idea is, 
as we have already noted, to guess a possible solution and then use this initial guess 
to compute a sequence of further approximations to the solution which, hopefully, 
will converge (in some topology) with relative rapidity. Usually, the initial guess 
is based on knowledge of the real-life problem which gave rise to the system of 
equations, something that can often be done with good accuracy. In very large and 
computationally-difficult situations (for example, weather prediction, chip design, 
large-scale economic models, computational acoustics, or modeling the chemistry
of polymer chains), one can even use Monte Carlo methods, based on statistical
sampling and estimation techniques, to come up with an initial guess or even an ap- 
proximate solution. 

To illustrate this approach, let us consider the problem of solving a system of
linear equations of the form AX = w, where $A = [a_{ij}] \in M_{n\times n}(\mathbb{R})$ is a
nonsingular matrix and $w = \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix} \in \mathbb{R}^n$.
We know that this system has a unique solution, namely $A^{-1}w$, but inverting the matrix
A may be computationally time-consuming and prone to error, so we are looking for
another method. Suppose that we can write A = E - D, where E is some matrix which is
easy to invert. Then if $v \in \mathbb{R}^n$ satisfies Av = w, we know that Ev = Dv + w and so
$v = E^{-1}(Dv + w)$. We now guess a value for v, call it $v^{(0)}$. Then, using this formula,
we can define new vectors $v^{(1)}, v^{(2)}, \dots$ iteratively by setting
$v^{(h)} = E^{-1}(Dv^{(h-1)} + w)$ for each h > 0. This can be done relatively quickly since,
by assumption, $E^{-1}$ was relatively easy to compute and, having computed it once for the
first step of the iteration, we don't need to recompute it for subsequent steps. Our hope
is that the sequence $v^{(0)}, v^{(1)}, \dots$ will in fact converge. Indeed, if this sequence
does converge to some vector v then it is easy to verify that v must be the unique
solution of AX = w.

For example, let us assume that the diagonal entries $a_{ii}$ of A are all nonzero,
and let us choose E to be the diagonal matrix having these entries on the diagonal.
Then $E^{-1}$ is also a diagonal matrix having the entries $a_{ii}^{-1}$ on the diagonal.
If our initial guess is $v^{(0)} = \begin{bmatrix} c_1^{(0)} \\ \vdots \\ c_n^{(0)} \end{bmatrix}$,
then it is easy to see that for h > 0 we have
$v^{(h)} = \begin{bmatrix} c_1^{(h)} \\ \vdots \\ c_n^{(h)} \end{bmatrix}$, where

$$c_i^{(h+1)} = a_{ii}^{-1}\Bigl[b_i - \sum_{j \neq i} a_{ij}c_j^{(h)}\Bigr]$$

for all $1 \le i \le n$. This method is known as the Jacobi iteration method. Another
possibility, again under the assumption that the diagonal entries $a_{ii}$ of A are all
nonzero, is to choose E to be the upper-triangular matrix $[e_{ij}]$ defined by setting
$e_{ij} = a_{ij}$ if $i \le j$ and $e_{ij} = 0$ otherwise. Given an initial guess
$v^{(0)} = \begin{bmatrix} c_1^{(0)} \\ \vdots \\ c_n^{(0)} \end{bmatrix}$, we see that
$v^{(h)} = \begin{bmatrix} c_1^{(h)} \\ \vdots \\ c_n^{(h)} \end{bmatrix}$ for h > 0, where

$$c_i^{(h+1)} = a_{ii}^{-1}\Bigl[b_i - \sum_{j > i} a_{ij}c_j^{(h+1)} - \sum_{j < i} a_{ij}c_j^{(h)}\Bigr]$$

for all $1 \le i \le n$, the entries being computed in the order $i = n, n-1, \dots, 1$.
This method is known as the Gauss-Seidel iteration method, since it was discovered
independently by Gauss and by Jacobi's student Philipp Ludwig von Seidel.



Carl Gustav Jacob Jacobi was a nineteenth-century German math- 
ematician, who worked mostly in analysis and applied mathematics. 
His work in astronomy led him to solve large systems of linear equa- 
tions, and his papers on determinants helped make them well-known. 


In both of the above methods, and in other iteration methods (and there are many 
of these), there is no guarantee that the sequence of approximations will always 
converge or that, even if it does converge, it will do so rapidly. Understanding the 
conditions for convergence and analyzing the speed of convergence requires so- 
phisticated techniques in numerical analysis, and indeed there are many examples 
of matrices for which one iteration scheme converges whereas another doesn’t, as 
well as various necessary and sufficient conditions for a given iteration method to
converge. For example, a sufficient condition for the Jacobi iteration method to
converge for a matrix $A = [a_{ij}]$ is that $\sum_{j \neq i} |a_{ij}| < |a_{ii}|$ for all
$1 \le i \le n$. It is also known that if the matrix A is tridiagonal, then the Jacobi
method converges if and only if the Gauss-Seidel method converges, but the latter
always converges faster.
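The sufficient condition just stated (strict diagonal dominance by rows) is simple to test. A minimal sketch follows; the function name and both test matrices are illustrative choices. Since the condition is only sufficient, a matrix that fails it may still lead to a convergent Jacobi iteration.

```python
def is_strictly_diagonally_dominant(a):
    """True if, in every row, the sum of |off-diagonal entries| is below |diagonal entry|."""
    return all(
        sum(abs(x) for j, x in enumerate(row) if j != i) < abs(row[i])
        for i, row in enumerate(a)
    )


dominant = is_strictly_diagonally_dominant([[4, 1, 1], [1, 5, 2], [0, 1, 3]])
not_dominant = is_strictly_diagonally_dominant([[4, 2, 1], [-1, 1, 2], [0, 1, 3]])
```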


The convergence and accuracy of the Gauss-Seidel iteration method 
was studied in detail by the Russian mathematician and engineer 
Alexander Ivanovich Nekrasov at the beginning of the twentieth cen- 
tury, long before the use of electronic computers. 


Example Let $A = \begin{bmatrix} 4 & 2 & 1 \\ -1 & 1 & 2 \\ 0 & 1 & 3 \end{bmatrix} \in M_{3\times 3}(\mathbb{R})$
and let $w = \begin{bmatrix} 7 \\ 2 \\ 4 \end{bmatrix}$. The system of linear equations Ax = w has
a unique solution $\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$. If we use the Jacobi iteration
method beginning with the initial guess $v^{(0)} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$,
we get the sequence of vectors (written to six-digit accuracy, each vector shown transposed):

$[0, 0, 0]^T$, $[1.75000, 2.00000, 1.33333]^T$, $[0.41667, 1.08333, 0.66667]^T$,
$[1.04167, 1.08333, 0.97222]^T$, $[0.96528, 1.09722, 0.97222]^T$,
$[0.95833, 1.02083, 0.96759]^T$, $[0.99768, 1.02314, 1.01157]^T$,
$[0.99016, 1.01157, 0.99228]^T$, $[0.99614, 1.00559, 0.99614]^T$,
$[0.99816, 1.00386, 0.99814]^T$, $[0.99853, 1.00190, 0.99871]^T$,

and if we use the Gauss-Seidel iteration method with the same initial guess, we get
the sequence of vectors (written to six-digit accuracy):

$[0, 0, 0]^T$, $[1.75000, -0.66667, 1.33333]^T$, $[1.04167, 0.63889, 1.55556]^T$,
$[1.06944, 0.80093, 1.12037]^T$, $[1.01504, 0.93672, 1.06636]^T$,
$[1.00829, 0.97287, 1.02109]^T$, $[1.00264, 0.99020, 1.00904]^T$,
$[1.00113, 0.99611, 1.00326]^T$, $[1.00040, 0.99853, 1.00129]^T$,
$[1.00016, 0.99943, 1.00049]^T$, $[1.00006, 0.99978, 1.00019]^T$,

so we see that both methods converge, albeit quite differently.
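The two iterations in this example can be reproduced with a short sketch. The function names are assumptions, and the data are taken to be the matrix with rows (4, 2, 1), (-1, 1, 2), (0, 1, 3) and w = (7, 2, 4), which is consistent with the listed iterates. The Gauss-Seidel step below uses the upper-triangular choice of E, so entries are updated from the last to the first.

```python
def jacobi_step(a, w, v):
    """One Jacobi step: every entry is computed from the previous vector only."""
    n = len(a)
    return [
        (w[i] - sum(a[i][j] * v[j] for j in range(n) if j != i)) / a[i][i]
        for i in range(n)
    ]


def gauss_seidel_step_upper(a, w, v):
    """One Gauss-Seidel step with E the upper-triangular part of A.

    Entries are updated from last to first; entry i uses the new values of
    the later entries and the old values of the earlier ones.
    """
    n = len(a)
    new = list(v)
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * new[j] for j in range(n) if j != i)
        new[i] = (w[i] - s) / a[i][i]
    return new


a = [[4.0, 2.0, 1.0], [-1.0, 1.0, 2.0], [0.0, 1.0, 3.0]]
w = [7.0, 2.0, 4.0]

jacobi = [[0.0, 0.0, 0.0]]
seidel = [[0.0, 0.0, 0.0]]
for _ in range(10):
    jacobi.append(jacobi_step(a, w, jacobi[-1]))
    seidel.append(gauss_seidel_step_upper(a, w, seidel[-1]))
```

After ten steps both lists end close to (1, 1, 1), with the Gauss-Seidel iterates somewhat closer.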


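Example Let $A = \begin{bmatrix} 1 & 2 & 0 \\ 2 & 1 & 2 \\ 0 & 2 & 1 \end{bmatrix} \in M_{3\times 3}(\mathbb{R})$
and let $w = \begin{bmatrix} 0 \\ -1 \\ 3 \end{bmatrix}$. The system of linear equations Ax = w has
a unique solution $\begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix}$. If we try to solve this system
using the Gauss-Seidel method with the initial guess $\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$
(here updating the entries from the first to the last, that is, taking E to be the
lower-triangular part of A), we get the sequence of vectors
$\begin{bmatrix} 0 \\ -1 \\ 5 \end{bmatrix}$, $\begin{bmatrix} 2 \\ -15 \\ 33 \end{bmatrix}$,
$\begin{bmatrix} 30 \\ -127 \\ 257 \end{bmatrix}$, $\begin{bmatrix} 254 \\ -1023 \\ 2049 \end{bmatrix}$, ...
which clearly diverges.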
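The divergence in this example is easy to observe directly. The sketch below (function name assumed) performs Gauss-Seidel steps in which the entries are updated from the first to the last, which is the order that reproduces the vectors listed above for the matrix with rows (1, 2, 0), (2, 1, 2), (0, 2, 1) and w = (0, -1, 3).

```python
def gauss_seidel_step_lower(a, w, v):
    """One Gauss-Seidel step updating entries from first to last.

    Entry i uses the new values of the earlier entries and the old values
    of the later ones (E is the lower-triangular part of A).
    """
    n = len(a)
    new = list(v)
    for i in range(n):
        s = sum(a[i][j] * new[j] for j in range(n) if j != i)
        new[i] = (w[i] - s) / a[i][i]
    return new


a = [[1, 2, 0], [2, 1, 2], [0, 2, 1]]
w = [0, -1, 3]
iterates = [[0, 0, 0]]
for _ in range(4):
    iterates.append(gauss_seidel_step_lower(a, w, iterates[-1]))
# The entries grow roughly by a factor of 8 per step instead of settling down.
```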


A more sophisticated iteration technique is, at each stage, not to replace $v^{(h)}$
by the computed $v^{(h+1)}$ but rather by a linear combination of the form
$rv^{(h+1)} + (1 - r)v^{(h)}$, where $r \in \mathbb{R}$ is a relaxation parameter. Doing this
with Jacobi iteration gives us the Jacobi overrelaxation (JOR) method, and doing it with
the Gauss-Seidel method gives us the successive overrelaxation (SOR) method. The
relaxation parameter r is chosen on the basis of certain properties of the matrix A. By
choosing this parameter wisely, one can often achieve a considerable improvement in
convergence. For the JOR method, one normally chooses 0 < r < 1. In 1958, Kahan
showed that the SOR method does not converge for r outside the open interval
(0, 2).
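As a sketch of the relaxation idea, the following code implements one JOR step exactly as described: compute the Jacobi update and then blend it with the previous vector using the parameter r. The function name, the 2 × 2 test system and the choice r = 0.5 are illustrative assumptions, not recommendations.

```python
def jor_step(a, w, v, r):
    """One Jacobi overrelaxation step: r * (Jacobi update) + (1 - r) * (old vector)."""
    n = len(a)
    jacobi = [
        (w[i] - sum(a[i][j] * v[j] for j in range(n) if j != i)) / a[i][i]
        for i in range(n)
    ]
    return [r * new + (1 - r) * old for new, old in zip(jacobi, v)]


a = [[4.0, 1.0], [1.0, 3.0]]
w = [1.0, 2.0]
v = [0.0, 0.0]
for _ in range(100):
    v = jor_step(a, w, v, 0.5)
# The exact solution of this small system is (1/11, 7/11).
```

With r = 1 the step reduces to the plain Jacobi step.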


Relaxation methods were 
first developed by the 
twentieth-century British 
mathematician Richard V. 
Southwell. Contemporary Canadian mathematician William Kahan has made major con- 
tributions to numerical analysis and matrix computation. The optimal relaxation parameters 
for the SOR method were calculated by the twentieth-century American mathematician 
David M. Young, Jr. 


As a rule of thumb, iteration methods work best for large sparse matrices, such
as those arising from the solution of systems of partial differential equations. As
previously remarked, in iteration methods truncation and roundoff errors tend to
cancel each other out, rather than accumulate. While sparse matrices arise in many
applications, such as circuit simulation, analyses of chemical processes, and
magnetic-field computation, there are also important situations, such as the matrices
arising in radial-basis function interpolation, a technique of great importance in
computer graphics, which lead to very large matrices almost all entries of which are
nonzero.

The Jacobi, Gauss-Seidel, JOR, and SOR methods are examples of iteration
methods of the form $v^{(h+1)} = \alpha(v^{(h)})$, where α is an affine transformation of
$\mathbb{R}^n$ that does not depend on h. Such methods are known as stationary iteration
methods. In a later chapter, we shall also mention some iteration methods which are not
stationary.


Example In the beginning of this chapter, we saw an example of how a nonlinear 
system of equations can be turned into a linear system. This can often be done 
in more general cases, producing large systems of linear equations of the form 
AX = w, where the matrix A is usually sparse and for which iteration methods are 
therefore appropriate. Consider, for example, the problem of finding real numbers 
a, b, and c such that the following conditions hold: 


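$$\begin{aligned} a^2 - b^2 + c^2 &= 6, \\ ab + ac + 4bc &= 29, \\ a^2 + 2ab - 2bc &= -7, \\ 2a^2 - 3ab + c^2 &= 5, \\ b^2 - c^2 + 5ab &= 5, \\ 2ac - 3b^2 &= -6. \end{aligned}$$

To linearize this, we begin by assigning variables to all of the terms appearing in the
equations: $X_1 = a^2$, $X_2 = b^2$, $X_3 = c^2$, $X_4 = ab$, $X_5 = ac$, and $X_6 = bc$. This then
yields the system of linear equations

$$\begin{bmatrix} 1 & -1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 & 4 \\ 1 & 0 & 0 & 2 & 0 & -2 \\ 2 & 0 & 1 & -3 & 0 & 0 \\ 0 & 1 & -1 & 5 & 0 & 0 \\ 0 & -3 & 0 & 0 & 2 & 0 \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \\ X_3 \\ X_4 \\ X_5 \\ X_6 \end{bmatrix} = \begin{bmatrix} 6 \\ 29 \\ -7 \\ 5 \\ 5 \\ -6 \end{bmatrix}$$

which has a unique solution $X_1 = 1$, $X_2 = 4$, $X_3 = 9$, $X_4 = 2$, $X_5 = 3$, and $X_6 = 6$,
from which we deduce that a = 1, b = 2, and c = 3 (or, changing all three signs,
a = -1, b = -2, and c = -3).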
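The linearized system is small enough to solve exactly. The sketch below uses an illustrative Gauss-Jordan helper (the name `solve` is an assumption, and the routine assumes a square nonsingular coefficient matrix), with the unknowns ordered as $a^2, b^2, c^2, ab, ac, bc$.

```python
from fractions import Fraction


def solve(a, w):
    """Solve a square nonsingular system exactly by Gauss-Jordan elimination."""
    n = len(a)
    m = [[Fraction(x) for x in row] + [Fraction(b)] for row, b in zip(a, w)]
    for col in range(n):
        # Raises StopIteration if no pivot exists, i.e. the matrix is singular.
        src = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[src] = m[src], m[col]
        lead = m[col][col]
        m[col] = [x / lead for x in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [row[n] for row in m]


coefficients = [
    [1, -1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 4],
    [1, 0, 0, 2, 0, -2],
    [2, 0, 1, -3, 0, 0],
    [0, 1, -1, 5, 0, 0],
    [0, -3, 0, 0, 2, 0],
]
right_hand_side = [6, 29, -7, 5, 5, -6]
x = solve(coefficients, right_hand_side)
```

The values of a, b, c are then recovered from the squares and products, up to a common sign.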


The iterative methods we have discussed so far are all linear, in the sense that 
they involve only methods of linear algebra. There are, however, also families of 
nonlinear iterative methods, involving the calculus of functions of several variables,
of which one should be aware. These include gradient (steepest-descent) methods
and conjugate-direction methods. A discussion of these methods is beyond the scope 
of this book. 

Finally, another important warning. When we attempt to solve systems of linear 
equations on a computer, it is important to remember that the system may be very 
sensitive, and small changes in the entries of the coefficient matrix may lead to 
large changes in the solution. Such systems are said to be ill-conditioned. Applied 
mathematicians and others who design mathematical models often take considerable 
pains to avoid creating ill-conditioned systems. 


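Example Let $A = \begin{bmatrix} 7 & 7 & 8 & 10 \\ 5 & 5 & 6 & 7 \\ 6 & 9 & 10 & 8 \\ 5 & 10 & 9 & 7 \end{bmatrix} \in M_{4\times 4}(\mathbb{R})$.
This matrix is nonsingular, with inverse equal to
$\begin{bmatrix} -41 & 68 & -17 & 10 \\ -6 & 10 & -3 & 2 \\ 10 & -17 & 5 & -3 \\ 25 & -41 & 10 & -6 \end{bmatrix}$.
Let $w = \begin{bmatrix} 32 \\ 23 \\ 33 \\ 31 \end{bmatrix}$. Then the system of equations AX = w has a
unique solution $\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}$. However, we also note that

$$A \begin{bmatrix} -7.2 \\ -0.1 \\ 2.9 \\ 6.0 \end{bmatrix} = \begin{bmatrix} 32.10 \\ 22.90 \\ 32.90 \\ 31.10 \end{bmatrix},$$

so that a change of 0.1 in each entry of w produces a completely different solution.
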
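The sensitivity in this example can be checked with exact arithmetic. In the sketch below, the entries of A and of its inverse are the ones read from the example (an assumption where the printed arrays are hard to read), and `mat_vec` is an illustrative helper.

```python
from fractions import Fraction

a = [[7, 7, 8, 10], [5, 5, 6, 7], [6, 9, 10, 8], [5, 10, 9, 7]]
a_inv = [[-41, 68, -17, 10], [-6, 10, -3, 2], [10, -17, 5, -3], [25, -41, 10, -6]]


def mat_vec(m, v):
    """Product of a matrix (list of rows) with a column vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


w = [32, 23, 33, 31]
w_perturbed = [Fraction("32.1"), Fraction("22.9"), Fraction("32.9"), Fraction("31.1")]

x = mat_vec(a_inv, w)                      # solution for the original right-hand side
x_perturbed = mat_vec(a_inv, w_perturbed)  # solution after changing each entry by 0.1
```

A perturbation of about 0.3 percent in w moves the first entry of the solution from 1 to -7.2.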
Example Consider the system of linear equations

$$\begin{bmatrix} 1 & -10 & 0 & 0 & 0 & 0 \\ 0 & 1 & -10 & 0 & 0 & 0 \\ 0 & 0 & 1 & -10 & 0 & 0 \\ 0 & 0 & 0 & 1 & -10 & 0 \\ 0 & 0 & 0 & 0 & 1 & -10 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix} X = \begin{bmatrix} -9 \\ -9 \\ -9 \\ -9 \\ -9 \\ 1 \end{bmatrix}$$

over $\mathbb{R}$. This system has a unique solution, namely
$\begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 \end{bmatrix}^T$. However, if we alter the
coefficient matrix by changing the (6, 6)-entry to $\frac{1000}{1001}$ (which is roughly
equal to 0.9990009), we will obtain a completely different solution, namely
$\begin{bmatrix} 101 & 11 & 2 & 1.1 & 1.01 & 1.001 \end{bmatrix}^T$.
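Because the coefficient matrix is upper bidiagonal, the system can be solved by back substitution alone, which makes the amplification visible: each step multiplies the deviation of the previous entry from 1 by 10. A minimal sketch with exact fractions follows (the function name is an assumption).

```python
from fractions import Fraction


def solve_upper_bidiagonal(last_entry):
    """Solve the 6 x 6 system with 1 on the diagonal (last_entry at position (6, 6)),
    -10 on the superdiagonal and right-hand side (-9, -9, -9, -9, -9, 1)."""
    n = 6
    x = [Fraction(0)] * n
    x[n - 1] = Fraction(1) / last_entry
    for i in range(n - 2, -1, -1):
        # Row i reads x_i - 10 x_{i+1} = -9.
        x[i] = -9 + 10 * x[i + 1]
    return x


original = solve_upper_bidiagonal(Fraction(1))
altered = solve_upper_bidiagonal(Fraction(1000, 1001))
```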


Since real-life computations are based, as a rule, on numbers gathered through 
some sort of a measurement process, which is, as a matter of fact, not completely 
accurate and certainly beyond our control, it is extremely important to know how 
sensitive the system is to possible small variations in the values of the entries. The 
numerical analysis of matrices deals extensively with this issue, and here we can 
only present a simplistic measure of this sensitivity for nonsingular square matrices 
over $\mathbb{R}$. To any matrix $A = [a_{ij}] \in M_{n\times n}(\mathbb{R})$, we will assign the number
$\theta(A)$ defined by $\theta(A) = \max_{1 \le j \le n}\bigl\{\sum_{i=1}^{n} |a_{ij}|\bigr\}$. The number
$\theta(A)\theta(A^{-1})$ is the condition number of the matrix A. Note that A has the same
condition number as $A^{-1}$ and as cA, for any $0 \neq c \in \mathbb{R}$.




Condition numbers were introduced by John von Neumann, one of 
the great mathematical geniuses of the twentieth century, who con- 
tributed to practically all branches of mathematics—pure and applied. 
Von Neumann was a major force in the introduction of digital comput- 
ers after World War II and the development of numerical methods for 
them. 


The condition number can be written in the form $g \times 10^t$, where $0.1 \le g < 1$. If
t > 0 then, as a rule of thumb, one can expect that the solution of a system of linear
equations AX = w will have t significant digits fewer than that of the entries of A.
Thus, if A is the matrix in the previous example, then $\theta(A) = 11$. Moreover,

$$A^{-1} = \begin{bmatrix} 1 & 10 & 100 & 1000 & 10000 & 100000 \\ 0 & 1 & 10 & 100 & 1000 & 10000 \\ 0 & 0 & 1 & 10 & 100 & 1000 \\ 0 & 0 & 0 & 1 & 10 & 100 \\ 0 & 0 & 0 & 0 & 1 & 10 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}$$

and so $\theta(A^{-1}) = 111{,}111$. Therefore $\theta(A)\theta(A^{-1})$ is roughly $0.12 \times 10^7$,
and so we cannot, as we have seen, expect any accuracy in our solution, if we assume our
data is only good to 6-digit accuracy.
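The numbers in this computation can be verified with a short sketch. It assumes the reading of the measure as the maximum over columns of the sum of absolute values of the entries (for this particular matrix the row sums would give the same values), and the names `theta` and `mat_mul` are illustrative.

```python
def theta(m):
    """Maximum, over the columns, of the sum of absolute values of the entries."""
    n_cols = len(m[0])
    return max(sum(abs(row[j]) for row in m) for j in range(n_cols))


def mat_mul(a, b):
    """Plain matrix product of two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


n = 6
# 1 on the diagonal, -10 on the superdiagonal.
a = [[1 if i == j else -10 if j == i + 1 else 0 for j in range(n)] for i in range(n)]
# Powers of 10 above the diagonal.
a_inv = [[10 ** (j - i) if j >= i else 0 for j in range(n)] for i in range(n)]

condition_number = theta(a) * theta(a_inv)  # 11 * 111111
```

Written as $g \times 10^t$ with $0.1 \le g < 1$, this value has t = 7.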

Similarly, Nievergelt's matrix
$\begin{bmatrix} 888445 & 887112 \\ 887112 & 885871 \end{bmatrix}$, which we have already
encountered, has condition number roughly equal to $0.39 \times 10^5$.

Of course, computing the condition number of a given matrix may also be a problem,
since it involves calculating $A^{-1}$. Fortunately, there are many fairly efficient
condition number estimators, algorithms that give a good estimate of the condition
number of a matrix with relatively low computational overhead.

Various techniques, going under the collective name of preconditioning tech- 
niques, are also often used to increase the speed of convergence and accuracy of 
various iterative methods. A discussion of these can be found in any advanced book 
on numerical matrix computation. 


Exercises 

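Exercise 579
Are the matrices $\begin{bmatrix} 3 & 4 & 1 \\ -2 & -4 & -6 \\ 5 & 2 & 7 \end{bmatrix}$ and
$\begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}$ in $M_{3\times 3}(\mathbb{R})$ row equivalent?

Exercise 580
Bring the matrix $\begin{bmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 4 & 3 \\ 2 & 3 & 1 & 4 \end{bmatrix} \in M_{3\times 4}(\mathbb{R})$
to reduced row echelon form.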


Exercise 581
Let F = GF(5). Bring the matrix [entries illegible in the source] $\in M_{3\times 4}(F)$ to
reduced row echelon form.


Exercise 582 
Solve the system of linear equations 


$$\begin{aligned} (3 - i)X_1 + (2 - i)X_2 + (4 + 2i)X_3 &= 2 + 6i, \\ (4 + 3i)X_1 - (5 + i)X_2 + (1 + i)X_3 &= 2 + 2i, \\ (2 - 3i)X_1 + (1 - i)X_2 + (2 + 4i)X_3 &= 5i \end{aligned}$$
over C. 


Exercise 583 
Solve the system of linear equations 


$$\begin{aligned} X_1 + 2X_2 + 4X_3 &= 31, \\ 5X_1 + X_2 + 2X_3 &= 29, \\ 3X_1 - X_2 + X_3 &= 10 \end{aligned}$$


over R. 




Exercise 584 
Solve the system of linear equations 


$$\begin{aligned} 3X_1 + 4X_2 + 10X_3 &= 1, \\ 2X_1 + 2X_2 + 2X_3 &= 0, \\ X_1 + X_2 + 5X_3 &= 1 \end{aligned}$$
over GF(11). 


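Exercise 585
Find all solutions to the system

$$\begin{bmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 2 & 3 \\ 3 & 2 & 1 & 2 \\ 4 & 3 & 2 & 1 \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \\ X_3 \\ X_4 \end{bmatrix} = \begin{bmatrix} 5 \\ 1 \\ 1 \\ -5 \end{bmatrix}$$

over $\mathbb{R}$.

Exercise 586
Find all solutions to the system

$$\begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 1 & 2 & 3 & 4 \\ 2 & 2 & 1 & 2 & 3 \\ 2 & 2 & 2 & 1 & 2 \\ 2 & 2 & 2 & 2 & 1 \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \\ X_3 \\ X_4 \\ X_5 \end{bmatrix} = \begin{bmatrix} 13 \\ 10 \\ 11 \\ 6 \\ 3 \end{bmatrix}$$

over $\mathbb{R}$.

Exercise 587
Find all solutions to the system

$$\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \\ X_3 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$$

over GF(2).

Exercise 588
Find all solutions to the system

$$\begin{bmatrix} 1 & 3 & 2 \\ 2 & -1 & 3 \\ 3 & -5 & 4 \\ 1 & 17 & 4 \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \\ X_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}$$

over $\mathbb{R}$.

Exercise 589
Find a real number a so that the system of linear equations

$$\begin{aligned} 2X_1 - X_2 + X_3 + X_4 &= 1, \\ X_1 + 2X_2 - X_3 + 4X_4 &= 2, \\ X_1 + 7X_2 - 4X_3 + 11X_4 &= a \end{aligned}$$

has a solution over $\mathbb{R}$.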


Exercise 590 
Find all real numbers c such that the system of equations 


$$\begin{aligned} X_1 + X_2 - X_3 &= 1, \\ X_1 + cX_2 + 3X_3 &= 2, \\ 2X_1 + 3X_2 + cX_3 &= 3 \end{aligned}$$


has a unique solution over R; find those real numbers c for which it has infinitely- 
many solutions over R; find those real numbers c for which it has no solution 
over R. 


Exercise 591 
Solve the system of linear equations 


X,+2X2+ X3=1, 
X,+X2+X3=0 


over GF(3). 


Exercise 592 
Solve the system of linear equations 


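$$\begin{aligned} X_1 + (\sqrt{2})X_2 + (\sqrt{2})X_3 &= 3, \\ X_1 + (1 + \sqrt{2})X_2 + X_3 &= 3 + \sqrt{2}, \\ X_1 + X_2 - (\sqrt{2})X_3 &= 4 + \sqrt{2} \end{aligned}$$
over $\mathbb{Q}(\sqrt{2})$.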


Exercise 593 
Solve the system of linear equations 


$$\begin{aligned} 4X_1 - 3X_2 &= 3, \\ 2X_1 - X_2 + 2X_3 &= 1, \\ 3X_1 + 2X_3 &= 4 \end{aligned}$$
over GF(5).


Exercise 594 
Solve the system of linear equations 


$$\begin{aligned} 4X_1 + 6X_2 + 2X_3 &= 8, \\ X_1 - aX_2 - 2X_3 &= -5, \\ 7X_1 + 3X_2 + (a - 5)X_3 &= 7 \end{aligned}$$


over R, for various values of the real number a. 


Exercise 595 

For a given real number a, solve the system 
$aX_1 + X_2 + X_3 = 1,$
$X_1 + aX_2 + X_3 = 1,$
$X_1 + X_2 + aX_3 = 1$


over R. 


Exercise 596 
For a given a € R, does the system of linear equations 
$aX_1 + X_2 + 2X_3 = 0,$
$X_1 - X_2 + aX_3 = 1,$
$X_1 + X_2 + X_3 = 1$


have a unique solution in R? 


Exercise 597 
Let a be an element of a field F. Find the set of all solutions to the system of 
linear equations 
$X_1 + X_2 + 4aX_3 = a,$
$X_1 + aX_2 - X_3 = 1,$
$X_1 + X_2 - X_3 = 1$


over F.


Exercise 598
For which $a \in \mathbb{Q}$ does the system of linear equations


$X_1 + 3X_2 - 2X_3 = 2,$
$3X_1 + 9X_2 - 2X_3 = 2,$
$2X_1 + 6X_2 + X_3 = a$


have a unique solution in Q? 


Exercise 599 
Find real numbers a, b, c, and d such that the points (1, 2), (−1, 6), (−2, 38),
and (2, 6) all lie on the curve $y = ax^4 + bx^3 + cx^2 + d$ in the Euclidean plane.


Exercise 600 
Find a polynomial $p(X) = a_2X^2 + a_1X + a_0 \in \mathbb{R}[X]$ satisfying $p(1) = -1$,
$p(-1) = 9$, and $p(2) = -3$.


Exercise 601 
Find a polynomial $p(X) = a_3X^3 + a_2X^2 + a_1X + a_0 \in \mathbb{R}[X]$ satisfying $p(0) = 2$,
$p(2) = 6$, $p(4) = 3$, and $p(6) = -5$.


Exercise 602 
Let F = GF(13). Find a homogeneous system of linear equations over F satis- 
fying the condition that its solution space equals 


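\[
F\left\{
\begin{pmatrix} 2 \\ 1 \\ 9 \\ 7 \\ 4 \end{pmatrix},
\begin{pmatrix} 8 \\ 3 \\ 10 \\ 5 \\ 12 \end{pmatrix},
\begin{pmatrix} 7 \\ 6 \\ 2 \\ 11 \\ 7 \end{pmatrix}
\right\}.
\]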


Exercise 603 


Let F be a field. Let $b \in F$ and let $A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix} \in M_{2\times 3}(F)$ be a


matrix satisfying the condition that the sum of the entries in each row and each 
column of A equals b. Show that b = 0. 


Exercise 604 
Let p(X) = X5 — 7X? + 12 € Q[X]. Find a polynomial q (X) € Q[X] of degree 
at most 3 satisfying p(a) = q (a) for all a € {0, 1, 2, 3}. 
Exercise 605
Find the rank of the matrix 


1 -1 2 3 4 
1 -l 2 0 
-1 2 1 1 3 | € M5sx5(R). 
1 5 -8 -5 -12 
3 -7 8 9 13 
Exercise 606 
Find the rank of the matrix 
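\[
\begin{pmatrix} 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & -1 & 0 \\ 0 & 1 & 0 & 1 & 1 \end{pmatrix} \in M_{5\times 5}(\mathbb{R}).
\]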
Exercise 607 
Let F = GF(2). Find the rank of the matrix $\begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix} \in M_{3\times 3}(F)$.
Exercise 608 
Let F = GF(5). Find the rank of the matrix 
\[
\begin{pmatrix} 1 & 2 & 3 & 4 & a \\ 4 & 3 & a & 1 & 2 \\ 2 & 3a & 2 & 4a & 1 \end{pmatrix}
\]
for various values of a € F. 
Exercise 609 
Do there exist a lower-triangular matrix L and an upper-triangular matrix U in
$M_{3\times 3}(\mathbb{Q})$ satisfying the condition $LU = \begin{pmatrix} 1 & -1 & 2 \\ 2 & -1 & 3 \\ 0 & 1 & 8 \end{pmatrix}$?


Exercise 610 
Let F = GF(5). For which values of $a \in F$ do there exist a lower-triangular
matrix L and an upper-triangular matrix U in $M_{3\times 3}(F)$ satisfying the condition
$LU = \begin{pmatrix} 1 & 1 & a \\ 4 & 1 & 0 \\ a & 1 & 4 \end{pmatrix}$?


Exercise 611

Find the LU-decomposition of $\begin{pmatrix} 4 & 2 & 3 & 4 \\ 2 & 0 & 2 & 2 \\ 3 & 4 & 4 & 5 \\ -1 & 0 & 2 & 3 \end{pmatrix} \in M_{4\times 4}(\mathbb{R})$.


Exercise 612 

Let A € Mnxn(R) be a tridiagonal matrix all diagonal entries of which are 
nonzero. Can we write A = LU, where L is a lower-triangular matrix and U 
is an upper-triangular matrix, both of which are also tridiagonal? 


Exercise 613 
Let F be a field and let a, b, c € F. Find the rank of the matrix 


\[
\begin{pmatrix} 1 & 1 & 1 \\ b+c & c+a & a+b \\ bc & ca & ab \end{pmatrix} \in M_{3\times 3}(F).
\]


Exercise 614 
Let F be a field and let a, b, c,d € F. Find the rank of the matrix 


\[
\begin{pmatrix} a & c & c \\ d & a+b & c \\ d & d & b \end{pmatrix} \in M_{3\times 3}(F).
\]
Exercise 615 
3 1 4 
Find the rank of the matrix i h 17 ; €e M4x4(R) for various values of 
2 2 3 


the real number a. 


Exercise 616 


Find the rank of $\begin{pmatrix} a & -1 & 2 & 1 \\ -1 & a & 5 & 2 \\ 10 & -6 & 1 & 1 \end{pmatrix} \in M_{3\times 4}(\mathbb{Q})$ for various values of the
rational number a.


Exercise 617 
Find the set of all real numbers a such that the rank of the matrix 


a 1 1 
1 —1 | € M3x3R) 
1 a 


equals 2.


Exercise 618

Let n be a positive integer and, for each $A \in M_{n\times n}(\mathbb{R})$, let r(A) be the rank of A.
Define a relation $\le$ on $M_{n\times n}(\mathbb{R})$ by setting $B \le A$ if and only if $r(A - B) =
r(A) - r(B)$. Is this a partial order relation?


Exercise 619 

Let F be a subfield of a field K. Let k and n be positive integers and let 
A € Mkxn(F) be a matrix having rank r. If we now think of A as an element of 
M«kxn(K), is its rank necessarily still equal to r? 


Exercise 620 
1 7 17 3 
: 4 4 8 6 PE 
Find k € Z such that the rank of 3 1 14 € M4x4(Q) is minimal. 
2k 8 20 2 


Exercise 621 

Let F be a field and let k and n be positive integers. For a matrix $A \in M_{k\times n}(F)$
having rank h, show that there exist matrices $B \in M_{k\times h}(F)$ and $C \in M_{h\times n}(F)$
such that A = BC.


Exercise 622 

Let k and n be positive integers and let F be a field. For matrices A, B € 
Mkxn(F), show that the rank of A + B is no more than the sum of the ranks 
of A and of B. 


Exercise 623 

Let k and n be positive integers and let F be a field. Let $A, B \in M_{k\times n}(F)$ be
matrices satisfying the condition that the row space of A and the row space of B
are disjoint. Does it follow from this that the rank of A + B equals the sum of the
rank of A and the rank of B?


Exercise 624 
Find bases for the row space and column space of the matrix 


1 2 =} =7 =2 


=1 =2 1 1  0|eM3xs(R). 
1 2 0 2 1 


Exercise 625 
Find matrices P, Q € M3x3(R) satisfying 


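\[
P \begin{pmatrix} 1 & 2 & 3 \\ 2 & -2 & 1 \\ 3 & 0 & 4 \end{pmatrix} Q
= \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}.
\]


Exercise 626
Write the rows of the matrix $A = \begin{pmatrix} 1 & 2 & 0 \\ i-1 & 2 & i \\ 0 & 2 & -i \end{pmatrix} \in M_{3\times 3}(\mathbb{C})$ as linear
combinations of the rows of $A^T$.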
Exercise 627 
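Calculate $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 0 & 1 & 2 & 3 & 4 \\ 0 & 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}^{-1} \in M_{5\times 5}(\mathbb{R})$.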
Exercise 628 


Let k and n be positive integers and let F be a field. Let $A = \begin{pmatrix} B & C \\ D & E \end{pmatrix}$ be a matrix
in $M_{k\times n}(F)$, where B is a nonsingular matrix in $M_{r\times r}(F)$ for some $1 \le r <
\min\{k, n\}$. Show that the rank of A equals r if and only if $DB^{-1}C = E$.


Exercise 629 
Let F be a field and let a, b, c be distinct elements of F. Furthermore, let d, e, f 
be distinct elements of F. What is the rank of the matrix 


\[
\begin{pmatrix} 1 & a & d & ad \\ 1 & b & e & be \\ 1 & c & f & cf \end{pmatrix} \in M_{3\times 4}(F)?
\]


Exercise 630 

Let k and n be positive integers and let F be a field. Let $A \in M_{k\times n}(F)$ and let
$w \in F^k$ be such that the system of linear equations AX = w has a nonempty set
of solutions and that all of these solutions satisfy the condition that the hth entry
in them is some fixed scalar c. What can we deduce about the columns of the
matrix A?


Exercise 631 

Let n be a positive integer and let F be a field. Let $O \ne A \in M_{n\times n}(F)$. Show
that there exists a nonnegative integer k such that the rank of $A^h$ equals the rank
of $A^k$ for all h > k.


Exercise 632 


Let A= | —1 —1 1 | € M3x3(R). Find the condition number of A. 
1 
Exercise 633

Let a be a positive real number. Is it necessarily true that the condition number
of $A = \begin{pmatrix} a & 1 & a \\ 0 & 0 & -1 \\ a & -1 & a \end{pmatrix} \in M_{3\times 3}(\mathbb{R})$ is greater than 2a + 1?


Exercise 634 
Find a positive real number a for which the condition number of 


E M3x3(R) 


> 
ll 
= Ore 
Se Qe 
=a O 


is maximal. 


Exercise 635 
Does there exist a system AX = w of linear equations in n unknowns (for some 
positive integer n) over R having precisely 35 distinct solutions? 


Exercise 636 
Can one find an integer h such that the condition number of the matrix 


1 —1000 1 
1 —100 0 
1 h 1 


is greater than 10°? 


11 Determinants


Let F be a field and let n be a positive integer. We would like to find a function
from $M_{n\times n}(F)$ to F which will serve as an oracle of singularity, namely a function
that assigns the value 0 to singular matrices and a value other than 0 to nonsingular
matrices. A function $\delta_n : M_{n\times n}(F) \to F$ is a determinant function if and only if
it satisfies the following conditions:
(1) $\delta_n(I) = 1$;
(2) $\delta_n(A) = 0$ if A is a matrix having a row all of the entries of which are 0;
(3) $\delta_n(E_{ij}A) = -\delta_n(A)$ for all $1 \le i \ne j \le n$;
(4) $\delta_n(E_{ij;c}A) = \delta_n(A)$ for all $1 \le i \ne j \le n$ and all $c \in F$;
(5) $\delta_n(E_{i;c}A) = c\,\delta_n(A)$ for all $1 \le i \le n$ and all $0 \ne c \in F$.

In particular, we note that for each $1 \le i \ne j \le n$ and all $c \in F$ we have $\delta_n(E_{ij}) =
-1 = \delta_n(E_{ij}^T)$, $\delta_n(E_{ij;c}) = 1 = \delta_n(E_{ij;c}^T)$, and $\delta_n(E_{i;c}) = c = \delta_n(E_{i;c}^T)$.

We have yet to show that such functions exist for all values of n, but certainly 
they exist for a few small ones. 


Example For n = 1, the function $\delta_1 : [a] \mapsto a$ is a determinant function. For n = 2,
the function $\delta_2 : \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \mapsto a_{11}a_{22} - a_{12}a_{21}$ is a determinant function.


As an immediate consequence of parts (1) and (5) of the definition, we see that
if $A = [a_{ij}] \in M_{n\times n}(F)$ is a diagonal matrix and if $\delta_n : M_{n\times n}(F) \to F$ is a
determinant function, then $\delta_n(A) = \bigl(\prod_{i=1}^n a_{ii}\bigr)\delta_n(I) = \prod_{i=1}^n a_{ii}$.

We now want to show that for each positive integer n there exists a determinant
function $\delta_n : M_{n\times n}(F) \to F$, and indeed that this function is unique. We will
first establish the uniqueness of these functions and check some of their properties,
holding off on existence until later in this chapter.


Proposition 11.1 Let F be a field. For each positive integer n there exists at
most one determinant function $\delta_n : M_{n\times n}(F) \to F$.


J.S. Golan, The Linear Algebra a Beginning Graduate Student Ought to Know, 221
Proof Let us assume that $\delta_n : M_{n\times n}(F) \to F$ and $\eta_n : M_{n\times n}(F) \to F$ are
determinant functions and let $\beta = \eta_n - \delta_n$. Then the function $\beta$ satisfies the following
conditions:

(1) $\beta(I) = 0$;

(2) $\beta(A) = 0$ if A is a matrix having a row all of the entries of which are 0;

(3) $\beta(E_{ij}A) = -\beta(A)$ for all $1 \le i \ne j \le n$;

(4) $\beta(E_{ij;c}A) = \beta(A)$ for all $1 \le i \ne j \le n$ and all $c \in F$;

(5) $\beta(E_{i;c}A) = c\,\beta(A)$ for all $1 \le i \le n$ and all $0 \ne c \in F$.

In particular, if $A \in M_{n\times n}(F)$ and E is an elementary matrix, then $\beta(A)$ and $\beta(EA)$
are either both equal to 0 or both different from 0. But for any matrix
A we know that there exist elementary matrices $E_1, \ldots, E_t$ in $M_{n\times n}(F)$ such that
either $E_1 \cdots E_tA = I$ or $E_1 \cdots E_tA$ is a matrix having at least one row all of the
entries of which equal 0. Therefore, $\beta(A) = 0$ for every $A \in M_{n\times n}(F)$. Thus $\beta$ is
the zero-function, and so $\delta_n = \eta_n$.


Proposition 11.2 Let F be a field and let $\delta_n : M_{n\times n}(F) \to F$ be a determinant
function. Then $\delta_n(A) \ne 0$ if and only if A is nonsingular.


Proof If A is nonsingular, there exist elementary matrices $E_1, \ldots, E_t$ in $M_{n\times n}(F)$
such that $E_1 \cdots E_tA = I$, and so, by the definition of the determinant function,
$\delta_n(A) = c\,\delta_n(I) = c$, where $0 \ne c \in F$, and so $\delta_n(A) \ne 0$. Now assume that
$\delta_n(A) \ne 0$ and that A is singular. Then there exist elementary matrices $E_1, \ldots, E_t$
in $M_{n\times n}(F)$ such that $E_1 \cdots E_tA$ is a matrix having at least one row all of the
entries of which equal 0. But then, for some $0 \ne c \in F$, we have $0 \ne \delta_n(A) =
c\,\delta_n(E_1 \cdots E_tA) = c0 = 0$, which is a contradiction, proving that A must be
nonsingular.


Thus we see that the determinant function, to the extent it exists, is the oracle we 
are seeking. 


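Example For real numbers a, b, c, d, the subset $\left\{ \begin{pmatrix} a+bi \\ c+di \end{pmatrix}, \begin{pmatrix} -c+di \\ a-bi \end{pmatrix} \right\}$ of $\mathbb{C}^2$ is
linearly dependent if and only if $A = \begin{pmatrix} a+bi & -c+di \\ c+di & a-bi \end{pmatrix} \in M_{2\times 2}(\mathbb{C})$ is singular. We have
already noted that $\delta_2 : \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \mapsto a_{11}a_{22} - a_{12}a_{21}$ is a determinant function, and so
this happens if and only if $\delta_2(A) = a^2 + b^2 + c^2 + d^2 = 0$, i.e., if and only if
a = b = c = d = 0.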


Proposition 11.3 Let F be a field and let $\delta_n : M_{n\times n}(F) \to F$ be a determinant
function. If A is a matrix in $M_{n\times n}(F)$ having two identical rows then
$\delta_n(A) = 0$.
Proof Suppose that rows h and k of A are identical. First, assume that the
characteristic of F is other than 2. Then $A = E_{hk}A$ and so $\delta_n(A) = \delta_n(E_{hk}A) = -\delta_n(A)$,
which implies that $\delta_n(A) = 0$. If the characteristic of F equals 2 then $\delta_n(A) =
\delta_n(E_{hk;1}A)$, and $E_{hk;1}A$ is a matrix one of the rows of which has all of its entries
equal to 0. Therefore, by Proposition 11.2, $\delta_n(A) = 0$.


Proposition 11.4 Let F be a field and let $\delta_n : M_{n\times n}(F) \to F$ be a determinant
function. If $A, B \in M_{n\times n}(F)$ then

(1) $\delta_n(AB) = \delta_n(A)\delta_n(B)$;

(2) $\delta_n(AB) = \delta_n(BA)$.


Proof (1) By Proposition 9.1, we know that AB is nonsingular if and only
if both A and B are nonsingular. Therefore, $\delta_n(A) = 0$ or $\delta_n(B) = 0$ if and
only if $\delta_n(AB) = 0$. If $\delta_n(A) \ne 0 \ne \delta_n(B)$ then there exist elementary matrices
$E_1, \ldots, E_t, G_1, \ldots, G_s$ in $M_{n\times n}(F)$ such that $B = E_1 \cdots E_tI$ and $A = G_1 \cdots G_sI$
and so $AB = G_1 \cdots G_sE_1 \cdots E_tI$, which implies that $\delta_n(AB) = \delta_n(A)\delta_n(B)$ from
the definition of a determinant function.

(2) This is an immediate consequence of (1), since $\delta_n(A)\delta_n(B) = \delta_n(B)\delta_n(A)$
in F.


Proposition 11.5 Let F be a field and let $\delta_n : M_{n\times n}(F) \to F$ be a determinant
function. If $A \in M_{n\times n}(F)$ is nonsingular then $\delta_n(A^{-1}) = \delta_n(A)^{-1}$.


Proof By Proposition 11.4, we see that $\delta_n(A^{-1})\delta_n(A) = \delta_n(A^{-1}A) = \delta_n(I) = 1$
and from this the result follows immediately.


Proposition 11.6 Let F be a field and let $\delta_n : M_{n\times n}(F) \to F$ be a determinant
function. If $A \in M_{n\times n}(F)$ then:

(1) $\delta_n(AE_{ij}) = -\delta_n(A)$ for all $1 \le i \ne j \le n$;

(2) $\delta_n(AE_{ij;c}) = \delta_n(A)$ for all $1 \le i \ne j \le n$ and all $c \in F$;

(3) $\delta_n(AE_{i;c}) = c\,\delta_n(A)$ for all $1 \le i \le n$ and all $0 \ne c \in F$.


Proof This is a direct consequence of the definition of the determinant function and
Proposition 11.4(2).


Proposition 11.7 Let F be a field and let $\delta_n : M_{n\times n}(F) \to F$ be a determinant
function. If $A \in M_{n\times n}(F)$ then $\delta_n(A) = \delta_n(A^T)$.
Proof If A is singular then so is $A^T$, and so $\delta_n(A) = 0 = \delta_n(A^T)$. If A is nonsingular
then there exist elementary matrices $E_1, \ldots, E_t$ in $M_{n\times n}(F)$ such that
$E_1 \cdots E_tA = I = I^T = A^TE_t^T \cdots E_1^T$. By our remarks in Chap. 9 concerning the
transposes of elementary matrices, and by the remarks at the beginning of this
chapter, we have $\delta_n(E_h) = \delta_n(E_h^T) \ne 0$ for each $1 \le h \le t$. Proposition 11.4 therefore gives
$\delta_n(E_1) \cdots \delta_n(E_t)\delta_n(A) = \delta_n(E_1 \cdots E_tA) = \delta_n(A^TE_t^T \cdots E_1^T) = \delta_n(A^T)\delta_n(E_t^T) \cdots \delta_n(E_1^T)$, and so
$\delta_n(A) = \delta_n(A^T)$.


Of course, at this stage we do not know that determinant functions $\delta_n :
M_{n\times n}(F) \to F$ even exist for the case n > 2 and so we now have to construct
them. Let us denote the set of all permutations of the set $\{1, \ldots, n\}$ by $S_n$. We note
that any $\pi \in S_n$ is a bijective function from $\{1, \ldots, n\}$ to itself and so there exists
a function $\pi^{-1} \in S_n$ satisfying the condition that $\pi\pi^{-1}$ and $\pi^{-1}\pi$ are equal to the
identity function $i \mapsto i$. We also note that if $\pi, \pi' \in S_n$ then $\pi\pi' \in S_n$.


Proposition 11.8 If n is a positive integer then the number of elements of $S_n$
equals n!.


Proof Suppose we wanted to construct an arbitrary element $\pi$ of $S_n$. There are n
possibilities for selecting $\pi(1)$. Once we have done that, there are n − 1 ways of
selecting $\pi(2)$, then n − 2 ways of selecting $\pi(3)$, etc. Thus, the total number of
ways in which we can define $\pi$ is $n(n-1) \cdots 1 = n!$.


Now let $\pi \in S_n$ and let $1 \le i < j \le n$. The pair (i, j) is called an inversion with
respect to $\pi$ if and only if $\pi(i) > \pi(j)$. That is to say, (i, j) is an inversion with
respect to $\pi$ if and only if

\[
\frac{i - j}{\pi(i) - \pi(j)} < 0.
\]

We will denote the number of distinct inversions with respect to $\pi$ by $h(\pi)$, and
define the signum of $\pi$ to be $\operatorname{sgn}(\pi) = (-1)^{h(\pi)}$. Thus

\[
\operatorname{sgn}(\pi) =
\begin{cases}
1 & \text{if there are an even number of inversions with respect to } \pi, \\
-1 & \text{if there are an odd number of inversions with respect to } \pi.
\end{cases}
\]

It is easy to check that $\operatorname{sgn}(\pi) = \operatorname{sgn}(\pi^{-1})$ for all $\pi \in S_n$. If $\operatorname{sgn}(\pi) = 1$, the
permutation $\pi$ is even; if $\operatorname{sgn}(\pi) = -1$, the permutation $\pi$ is odd.


Example Let $\pi \in S_4$ be defined by $1 \mapsto 3$, $2 \mapsto 4$, $3 \mapsto 2$, and $4 \mapsto 1$. Then if we
consider all possible pairs (i, j) with $1 \le i < j \le 4$ we get

(i, j)    (π(i), π(j))    inversion?
(1, 2)    (3, 4)          no
(1, 3)    (3, 2)          yes
(1, 4)    (3, 1)          yes
(2, 3)    (4, 2)          yes
(2, 4)    (4, 1)          yes
(3, 4)    (2, 1)          yes

There are five inversions, and so we see that $\operatorname{sgn}(\pi) = (-1)^5 = -1$.
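The inversion count in the preceding example can be checked mechanically. The following Python sketch is only an illustration: the function names `inversions` and `sgn` are chosen here and do not come from the text, and a permutation is represented (by assumption) as the tuple of its images (π(1), ..., π(n)).

```python
from itertools import combinations

def inversions(perm):
    """Return all inversions (i, j), 1-indexed with i < j, of a permutation.

    `perm` is the tuple (pi(1), ..., pi(n)); this representation is an
    assumption of the sketch, not notation from the text.
    """
    n = len(perm)
    return [(i, j) for i, j in combinations(range(1, n + 1), 2)
            if perm[i - 1] > perm[j - 1]]

def sgn(perm):
    """Signum of a permutation: (-1) to the number of inversions."""
    return -1 if len(inversions(perm)) % 2 else 1

# The permutation 1 -> 3, 2 -> 4, 3 -> 2, 4 -> 1 from the example.
pi = (3, 4, 2, 1)
print(inversions(pi))
print(sgn(pi))
```

Every pair except (1, 2) is reported as an inversion, in agreement with the table.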


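Now let n be a positive integer and let $(K, \bullet)$ be an associative and commutative
unital F-algebra. Let $A = [a_{ij}] \in M_{n\times n}(K)$. We then define the function $A \mapsto |A|$
from $M_{n\times n}(K)$ to K by setting

\[
|A| = \sum_{\pi \in S_n} \operatorname{sgn}(\pi)\, a_{\pi(1),1} \bullet a_{\pi(2),2} \bullet \cdots \bullet a_{\pi(n),n}.
\]

Note that, by the commutativity of K, if $\tau = \pi^{-1}$ then

\[
a_{\pi(1),1} \bullet a_{\pi(2),2} \bullet \cdots \bullet a_{\pi(n),n} = a_{1,\tau(1)} \bullet a_{2,\tau(2)} \bullet \cdots \bullet a_{n,\tau(n)}
\]

and so $|A| = \sum_{\tau \in S_n} \operatorname{sgn}(\tau)\, a_{1,\tau(1)} \bullet a_{2,\tau(2)} \bullet \cdots \bullet a_{n,\tau(n)}$. Thus we see immediately
that $|A| = |A^T|$ for every $A \in M_{n\times n}(K)$. If $K = \mathbb{C}$ then, since $\overline{c + d} = \bar{c} + \bar{d}$ and
$\overline{cd} = \bar{c}\bar{d}$, we also see that for $\bar{A} = [\bar{a}_{ij}]$ we have $|\bar{A}| = \overline{|A|}$. Defining this function
for an arbitrary commutative and associative unital F-algebra is important for us, as
we will need it in the case that K = F[X], where F is a field.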


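Example If $A = [a_{ij}] \in M_{3\times 3}(K)$, for an associative and commutative unital
F-algebra $(K, \bullet)$, then

\[
\begin{aligned}
|A| ={} & a_{11} \bullet a_{22} \bullet a_{33} + a_{12} \bullet a_{23} \bullet a_{31} + a_{13} \bullet a_{21} \bullet a_{32} \\
& - a_{11} \bullet a_{23} \bullet a_{32} - a_{13} \bullet a_{22} \bullet a_{31} - a_{12} \bullet a_{21} \bullet a_{33}.
\end{aligned}
\]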
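The permutation-sum definition of |A| translates directly into code. The sketch below is a minimal illustration over the integers, not an efficient method; the names `leibniz_det` and `det3` are chosen here for the illustration. It evaluates the sum over all permutations and compares the result with the six-term formula for the 3 × 3 case.

```python
from itertools import permutations

def sgn(perm):
    """Signum of a 0-indexed permutation, via its number of inversions."""
    n = len(perm)
    inv = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inv % 2 else 1

def leibniz_det(a):
    """Sum over all permutations pi of sgn(pi) * a[pi(1)][1] * ... * a[pi(n)][n]."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        term = sgn(perm)
        for col in range(n):
            term *= a[perm[col]][col]  # entry in row pi(col), column col
        total += term
    return total

def det3(a):
    """The explicit six-term formula for a 3 x 3 matrix."""
    return (a[0][0] * a[1][1] * a[2][2] + a[0][1] * a[1][2] * a[2][0]
            + a[0][2] * a[1][0] * a[2][1] - a[0][0] * a[1][2] * a[2][1]
            - a[0][2] * a[1][1] * a[2][0] - a[0][1] * a[1][0] * a[2][2])

A = [[2, -1, 0], [1, 3, 4], [5, 0, -2]]
print(leibniz_det(A), det3(A))
```

Both functions return the same value on the sample matrix, and applying `leibniz_det` to the transpose gives that value again, as the identity relating |A| to the determinant of the transpose requires.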


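Proposition 11.9 Let F be a field, let $(K, \bullet)$ be an associative and commutative
unital F-algebra, and let $A = [a_{ij}] \in M_{n\times n}(K)$. Pick $1 \le h \le n$ and
write $a_{hj} = b_{hj} + c_{hj}$ in K for all $1 \le j \le n$. For all $1 \le i \le n$ satisfying
$i \ne h$, set $b_{ij} = c_{ij} = a_{ij}$. Set $B = [b_{ij}]$ and $C = [c_{ij}]$, matrices in $M_{n\times n}(K)$.
Then |A| = |B| + |C|.
Proof From the definition of |A|, we have

\[
\begin{aligned}
|A| &= \sum_{\pi \in S_n} \operatorname{sgn}(\pi)\, a_{1,\pi(1)} \bullet \cdots \bullet a_{h,\pi(h)} \bullet \cdots \bullet a_{n,\pi(n)} \\
&= \sum_{\pi \in S_n} \operatorname{sgn}(\pi)\, a_{1,\pi(1)} \bullet \cdots \bullet [b_{h,\pi(h)} + c_{h,\pi(h)}] \bullet \cdots \bullet a_{n,\pi(n)} \\
&= \sum_{\pi \in S_n} \operatorname{sgn}(\pi)\, a_{1,\pi(1)} \bullet \cdots \bullet b_{h,\pi(h)} \bullet \cdots \bullet a_{n,\pi(n)} \\
&\quad + \sum_{\pi \in S_n} \operatorname{sgn}(\pi)\, a_{1,\pi(1)} \bullet \cdots \bullet c_{h,\pi(h)} \bullet \cdots \bullet a_{n,\pi(n)} \\
&= |B| + |C|,
\end{aligned}
\]

as required.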


We are now ready to prove that determinant functions, in fact, always exist. 


Proposition 11.10 For an integer n > 1 and a field F, the function
$M_{n\times n}(F) \to F$ defined by $A \mapsto |A|$ is a determinant function.


Proof In order to simplify our notation, we will make the following temporary
convention: if $\pi \in S_n$ and if $A = [a_{ij}] \in M_{n\times n}(F)$, we will write $u(\pi, A) =
\operatorname{sgn}(\pi)\, a_{1,\pi(1)} \cdots a_{n,\pi(n)}$, so that $|A| = \sum_{\pi \in S_n} u(\pi, A)$. Now let us check the five
conditions of a determinant function.

(1) Clearly, $u(\pi, I)$ equals 1 if $\pi$ is the identity permutation and 0 otherwise, and
so |I| = 1.

(2) Let A be a matrix one of the rows of which has all of its entries equal to 0.
Since a factor from each row appears in every term $u(\pi, A)$, we conclude that all of
these are equal to 0 and hence |A| = 0.

(3) Let A be a matrix and let $B = E_{ij}A$. Let $\rho \in S_n$ be the permutation which
interchanges i and j and leaves all of the other numbers between 1 and n fixed.
Then $\operatorname{sgn}(\pi\rho) \ne \operatorname{sgn}(\pi)$ for all $\pi \in S_n$, while the products of entries appearing in
$u(\pi, B)$ and in $u(\pi\rho, A)$ are the same, and so for each $\pi \in S_n$ we have
$u(\pi, B) = -u(\pi\rho, A)$. Since $\pi \mapsto \pi\rho$ is a bijection from $S_n$ to itself, this implies that |B| = −|A|.

(4) Let A be a matrix and let $B = E_{ij;c}A$. Then $B = [b_{ht}]$, where $b_{ht} = a_{ht}$ when
$h \ne j$ and $1 \le t \le n$, and where $b_{jt} = a_{jt} + ca_{it}$ for all $1 \le t \le n$. By Proposition 11.9,
we have |B| = |A| + |C|, where C is the matrix all of the rows of which
except the jth are identical with those of A, and where in the jth row we have
$c_{jt} = ca_{it}$ for all $1 \le t \le n$. Then |C| = c|D| where D is a matrix in which two
rows, the ith and the jth, are equal. If the characteristic of F is other than 2, then
$D = E_{ij}D$ and so, by (3), we get |D| = −|D|, and so we get |C| = c|D| = 0 and
we have |A| = |B|, which is what we want. Therefore, let us assume that the
characteristic of F equals 2. Let $\rho \in S_n$ be the permutation which interchanges i and
j and leaves all other numbers between 1 and n fixed. Let H be the set of all
even permutations in $S_n$ and let K be the set of all odd permutations. The function
from H to K defined by $\pi \mapsto \pi\rho$ is bijective since $\pi_1\rho = \pi_2\rho$ implies that
$\pi_1 = \pi_1\rho\rho^{-1} = \pi_2\rho\rho^{-1} = \pi_2$. Moreover, since the characteristic of F is 2 and
since rows i and j of D are equal, we have $u(\pi, D) = u(\pi\rho, D)$ for all $\pi \in H$, and so
$u(\pi, D) + u(\pi\rho, D) = 0$ for all $\pi \in H$. Therefore, $|D| = \sum_{\pi \in H} [u(\pi, D) + u(\pi\rho, D)] = 0$
and this implies, again, that |C| = 0 and so |A| = |B|.

(5) It is clear from the definition of |A| that if $B = E_{i;c}A$ then |B| = c|A|.


Thus, in summary, we see that if F is a field and if n is a positive integer, then
there exists a unique determinant function $M_{n\times n}(F) \to F$, namely $A \mapsto |A|$. We
call the scalar |A| the determinant of the matrix A.


With kind permission of the Archives of the Mathematisches 
Forschungsinstitut Oberwolfach (Scherk). 

Determinants were first used in the work of 
the seventeenth-century German mathematician, 
philosopher, and diplomat Gottfried von Leibnitz, 
who developed calculus along with Sir Isaac New- 
ton. The common properties of determinants were 
first studied by the nineteenth-century German 
mathematician Heinrich Scherk, and the first systematic analysis of the theory of determi- 
nants was done by the nineteenth-century French mathematician Augustin-Louis Cauchy, 
relying on the work of many mathematicians who preceded him. His work was continued 
by Cayley and Sylvester. The term “determinant” was first used by Gauss in 1801, and was 
popularized by Jacobi. 


Example Let n > 1 be an integer. If $c_1, \ldots, c_n$ are distinct elements of a field F
and if $A = [a_{ij}] \in M_{n\times n}(F)$ is the Vandermonde matrix defined by $a_{ij} = c_i^{j-1}$ for
all $1 \le i, j \le n$, then it is easy to verify that $|A| = \prod_{i<j} (c_j - c_i) \ne 0$. This result
can, in fact, be generalized. Suppose that, for $1 \le h \le n$, we have a polynomial
$p_h(X) = \sum_{i=1}^{h} b_{hi}X^{i-1} \in F[X]$ with $b_{hh} \ne 0$. Let $c_1, \ldots, c_n$ be distinct elements of a
field F and let $A = [a_{ij}] \in M_{n\times n}(F)$ be defined by $a_{ij} = p_j(c_i)$ for all $1 \le i, j \le n$.
Then $|A| = b_{11} \cdots b_{nn} \prod_{i<j} (c_j - c_i) \ne 0$.
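The Vandermonde product formula is easy to test on a small instance. The following Python sketch is an illustration only: the nodes 2, 3, 5, 7 and the helper names `det` and `vandermonde` are arbitrary choices made here. The determinant is evaluated by the permutation-sum definition, which is practical only because the matrix is tiny.

```python
from itertools import permutations
from math import prod

def det(a):
    """Determinant by the permutation-sum definition (small matrices only)."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        inv = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
        term = -1 if inv % 2 else 1
        for row in range(n):
            term *= a[row][perm[row]]
        total += term
    return total

def vandermonde(c):
    """Matrix with (i, j) entry c_i ** (j - 1), written with 0-based exponents."""
    n = len(c)
    return [[ci ** j for j in range(n)] for ci in c]

c = [2, 3, 5, 7]  # distinct nodes, chosen arbitrarily for this check
expected = prod(c[j] - c[i] for i in range(len(c)) for j in range(i + 1, len(c)))
print(det(vandermonde(c)), expected)
```

For these nodes both numbers equal (3−2)(5−2)(7−2)(5−3)(7−3)(7−5) = 240.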


Example As a consequence of Proposition 11.7, we note that if n > 0 is odd and if
$A \in M_{n\times n}(F)$ is a skew-symmetric matrix then $|A| = |A^T| = |-A| = -|A|$ and so
|A| = 0. Therefore, by Proposition 11.2, A is singular. If n is even, then one can use
the definition of |A| to show that $|A| = b^2$ for some b which is a sum of products of
the $a_{ij}$. Thus, for example,

\[
\begin{vmatrix}
0 & a_{12} & a_{13} & a_{14} \\
-a_{12} & 0 & a_{23} & a_{24} \\
-a_{13} & -a_{23} & 0 & a_{34} \\
-a_{14} & -a_{24} & -a_{34} & 0
\end{vmatrix}
= [a_{12}a_{34} - a_{13}a_{24} + a_{14}a_{23}]^2.
\]

This number b is called the Pfaffian of the matrix A. Pfaffians arise naturally in
combinatorics, differential geometry, and other areas of mathematics.


Pfaffians were first defined by Cayley, and named in honor of Johann 
Pfaff, an eighteenth-century German mathematician whose most fa- 
mous doctoral student was Gauss. 


We now give two examples of why it was worthwhile to define |A| for matrices 
A with entries in an associative and commutative unital F-algebra, and not just a 
field. 


Example Let V be a vector space of finite dimension n over a field F and let
$B = \{v_1, \ldots, v_k\}$ be a linearly-independent subset of V. Let $y_1, \ldots, y_k$ be a list
of vectors in V. We claim that there are at most finitely-many elements a of F
satisfying the condition that the list $v_1 + ay_1, \ldots, v_k + ay_k$ is linearly dependent. To
establish this claim, we will consider determinants of matrices over F[X]. Indeed,
extend B to a basis $D = \{v_1, \ldots, v_n\}$ of V. Then, for each $1 \le i \le k$, we can write
$y_i = \sum_{j=1}^{n} c_{ij}v_j$. For each $1 \le i, j \le k$, define the polynomial $p_{ij}(X) \in F[X]$ by
setting

\[
p_{ij}(X) =
\begin{cases}
c_{ii}X + 1 & \text{if } i = j, \\
c_{ij}X & \text{otherwise,}
\end{cases}
\]

and consider the matrix $B = [p_{ij}(X)] \in M_{k\times k}(F[X])$. Then |B| is a polynomial
q(X) in F[X], which is not the 0-polynomial since q(0) = 1. Moreover, for any
$a \in F$, we see that q(a) = 0 whenever the list $v_1 + ay_1, \ldots, v_k + ay_k$ is linearly
dependent. Since a polynomial can have only finitely-many distinct roots, this can
happen only for finitely-many values of a.


Example Let n > 1 be an integer and let U be an open interval of real numbers. Let
K be the set of all functions in $\mathbb{R}^U$ which are differentiable at least n − 1 times.
Then K is an associative and commutative unital R-algebra which is not entire, let
alone a field. We will denote the derivative of a function $f \in K$ by Df and, if h > 1,
we will denote the hth derivative of f by $D^hf$. Given $f_1, \ldots, f_n \in K$, the function

\[
W(f_1, \ldots, f_n) : t \mapsto
\begin{vmatrix}
f_1(t) & f_2(t) & \cdots & f_n(t) \\
(Df_1)(t) & (Df_2)(t) & \cdots & (Df_n)(t) \\
\vdots & \vdots & & \vdots \\
(D^{n-1}f_1)(t) & (D^{n-1}f_2)(t) & \cdots & (D^{n-1}f_n)(t)
\end{vmatrix}
\]

is called the Wronskian of $f_1, \ldots, f_n$. One can show that if we have
$W(f_1, \ldots, f_n)(t) \ne 0$ for some $t \in U$ then the subset $\{f_1, \ldots, f_n\}$ of K is linearly independent
over R. The converse is false. To see this, let U be an open interval containing the
origin, let $f_1 : t \mapsto t^3$, and let $f_2 : t \mapsto |t^3|$. Then $\{f_1, f_2\}$ is linearly independent
over R, but $W(f_1, f_2)(t) = 0$ for any $t \in U$.


The insight of Josef Wronski, a nineteenth-century Polish mathemati- 
cian living in France, was obscured by his decidedly eccentric philo- 
sophical ideas and style of writing, and was recognized only after his 
death. The notion of a determinant of functions was first used by Ja- 
cobi. 


Example Let n be a positive integer equal to 2 or divisible by 4. A matrix
$A = [a_{ij}] \in M_{n\times n}(\mathbb{C})$ with $|a_{ij}| \le 1$ for all $1 \le i, j \le n$ having maximal possible
determinant (in absolute value) is known as an Hadamard matrix (though, in
fact, such matrices were studied by Sylvester, a generation before Hadamard
considered them). For such a matrix, the absolute value of |A| equals $n^{n/2}$, and the entries of A are all
±1. Indeed, a matrix A is an Hadamard matrix precisely when all of its entries
are ±1 and $AA^T = nI$. Thus,

\[
\begin{pmatrix} -1 & 1 & 1 & 1 \\ 1 & -1 & 1 & 1 \\ 1 & 1 & -1 & 1 \\ 1 & 1 & 1 & -1 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix}
\]

are Hadamard matrices. Moreover, for each $t \ge 1$, there exists an Hadamard matrix
$H_t$ of size $2^t \times 2^t$, defined recursively by setting $H_1 = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$ and
$H_t = \begin{pmatrix} H_{t-1} & H_{t-1} \\ H_{t-1} & -H_{t-1} \end{pmatrix}$ for each t > 1.

We also note immediately that if A is an Hadamard matrix so are $A^T$ and −A.
Hadamard matrices have important applications in algebraic coding theory,
especially in defining the error-correcting Reed-Muller codes. Needless to say, the
determinants of Hadamard matrices get very big very quickly. If A is a 16 × 16 Hadamard
matrix, then the absolute value of |A| is 4,294,967,296, and if B is a 32 × 32 Hadamard
matrix, then the absolute value of |B| is 1,208,925,819,614,629,174,706,176.
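The recursive doubling construction can be sketched in a few lines of Python. The sketch assumes the base matrix with rows (1, 1) and (1, −1), which is the usual starting point for Sylvester's construction; the function names `sylvester` and `is_hadamard` are chosen here for the illustration.

```python
def sylvester(t):
    """Return a 2**t x 2**t matrix built by repeated doubling.

    Assumes the 2 x 2 base matrix [[1, 1], [1, -1]]; each step replaces
    H by the block matrix [[H, H], [H, -H]].
    """
    h = [[1, 1], [1, -1]]
    for _ in range(t - 1):
        top = [row + row for row in h]
        bottom = [row + [-x for x in row] for row in h]
        h = top + bottom
    return h

def is_hadamard(h):
    """Check that all entries are +1 or -1 and that H times its transpose is n I."""
    n = len(h)
    if any(abs(x) != 1 for row in h for x in row):
        return False
    return all(
        sum(h[i][k] * h[j][k] for k in range(n)) == (n if i == j else 0)
        for i in range(n) for j in range(n)
    )

H4 = sylvester(4)           # a 16 x 16 matrix
print(len(H4), is_hadamard(H4))
print(16 ** 8, 32 ** 16)    # n ** (n / 2) for n = 16 and n = 32
```

The last line reproduces the two large numbers quoted above, since $16^{8} = 2^{32}$ and $32^{16} = 2^{80}$.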


We still are faced with the problem of actually computing the determinant of an 
n x n matrix A, especially when n is large. If we work using the definition, we 
see that we must add n! summands, each of which requires n — 1 multiplications. 
The total number of arithmetic operations needed is therefore $(n-1)n! + (n! - 1) =
n(n!) - 1$, which is a huge number even if n is relatively small. For example, if
we are using a computer capable of performing a billion arithmetic operations per 
second, it would take us 12,200,000,000 years of nonstop computation to compute 
the determinant of a 25 x 25 matrix, based on the definition. Thus we must find 
better methods of computing determinants, a task which became a high priority for 
many nineteenth-century mathematicians. 
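The running-time estimate is simple arithmetic and can be reproduced as follows. This Python sketch assumes, as in the text, a machine performing one billion operations per second; the year length of 365.25 days and the function name `definition_op_count` are choices made here.

```python
import math

def definition_op_count(n):
    """Operations needed by the definition: (n - 1) * n! multiplications plus n! - 1 additions."""
    return (n - 1) * math.factorial(n) + (math.factorial(n) - 1)

OPS_PER_SECOND = 1e9                      # one billion operations per second
SECONDS_PER_YEAR = 365.25 * 24 * 3600     # assumed year length

ops = definition_op_count(25)
years = ops / OPS_PER_SECOND / SECONDS_PER_YEAR
print(ops == 25 * math.factorial(25) - 1)
print(f"{years:.3e} years")
```

The result is a little over $1.2 \times 10^{10}$ years, in line with the figure of roughly 12.2 billion years.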
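Example Let $A = [a_{ij}] \in M_{n\times n}(F)$ be a matrix in which $a_{11} \ne 0$. Then Chiò,
Dodgson, and others showed that $|A| = a_{11}^{2-n}|B|$, where $B \in M_{(n-1)\times(n-1)}(F)$ is
the matrix obtained from A by erasing the first row and first column and replacing
each other $a_{ij}$ by $\begin{vmatrix} a_{11} & a_{1j} \\ a_{i1} & a_{ij} \end{vmatrix}$. Thus, for example,

\[
\begin{vmatrix} 1 & 2 & 3 & 4 \\ 8 & 7 & 6 & 5 \\ 1 & 8 & 2 & 7 \\ 3 & 6 & 4 & 5 \end{vmatrix}
=
\begin{vmatrix}
\begin{vmatrix} 1 & 2 \\ 8 & 7 \end{vmatrix} & \begin{vmatrix} 1 & 3 \\ 8 & 6 \end{vmatrix} & \begin{vmatrix} 1 & 4 \\ 8 & 5 \end{vmatrix} \\[2ex]
\begin{vmatrix} 1 & 2 \\ 1 & 8 \end{vmatrix} & \begin{vmatrix} 1 & 3 \\ 1 & 2 \end{vmatrix} & \begin{vmatrix} 1 & 4 \\ 1 & 7 \end{vmatrix} \\[2ex]
\begin{vmatrix} 1 & 2 \\ 3 & 6 \end{vmatrix} & \begin{vmatrix} 1 & 3 \\ 3 & 4 \end{vmatrix} & \begin{vmatrix} 1 & 4 \\ 3 & 5 \end{vmatrix}
\end{vmatrix}
=
\begin{vmatrix} -9 & -18 & -27 \\ 6 & -1 & 3 \\ 0 & -5 & -7 \end{vmatrix}
= -144.
\]

This method can, of course, be iterated. The method of evaluating determinants in
this way is known as the method of condensation.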
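The condensation step can be iterated mechanically. The following Python sketch is an illustration under one assumption worth stating: the entry in the upper left corner must be nonzero at every stage, and the sketch simply raises an error otherwise rather than swapping rows. The function name `chio_det` is chosen here; exact rational arithmetic is used so that the division by a power of the pivot is exact.

```python
from fractions import Fraction

def chio_det(a):
    """Determinant by repeated condensation on the (1, 1) entry.

    Assumes the pivot a[0][0] is nonzero at every stage; no row swapping
    is attempted in this sketch.
    """
    a = [[Fraction(x) for x in row] for row in a]
    n = len(a)
    if n == 1:
        return a[0][0]
    pivot = a[0][0]
    if pivot == 0:
        raise ValueError("condensation needs a nonzero (1, 1) entry")
    # Replace each remaining a[i][j] by the 2 x 2 determinant
    # | a[0][0] a[0][j] |
    # | a[i][0] a[i][j] |
    b = [[pivot * a[i][j] - a[0][j] * a[i][0] for j in range(1, n)]
         for i in range(1, n)]
    # |A| = pivot ** (2 - n) * |B|
    return chio_det(b) / pivot ** (n - 2)

A = [[1, 2, 3, 4], [8, 7, 6, 5], [1, 8, 2, 7], [3, 6, 4, 5]]
print(chio_det(A))
```

On the 4 × 4 matrix of the example the first condensation step produces the 3 × 3 matrix with rows (−9, −18, −27), (6, −1, 3), (0, −5, −7), and the final value is −144.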


© George E. Andrews (Andrews). 


During the nineteenth century, matrix theory 
and the theory of determinants attracted many 
gifted mathematicians and mathematical ama- 
teurs. Felice Chiò was a nineteenth-century Ital- 
ian mathematician and physicist. On the other 
hand, Rev. Charles Lutwidge Dodgson was an 
amateur who is better known by his pen name 
Lewis Carroll, the author of Alice in Wonderland. Dodgson published several works on 
mathematics and mathematical logic. In the twentieth century, ingenious ways for com- 
puting determinants of matrices arising from various combinatorial problems have been 
devised by American mathematician George Andrews. 


Let $A = [a_{ij}] \in M_{n\times n}(K)$, where K is an associative and commutative unital
F-algebra. For each $1 \le i, j \le n$, we define the minor of the entry $a_{ij}$ of A to be
$|A_{ij}|$, where $A_{ij} \in M_{(n-1)\times(n-1)}(K)$ is the matrix obtained from A by erasing the
ith row and the jth column.


Example If A= 


Nw e 
Ww oo w 
Or 


sthen Ars = | and Aza = | 5 ai 
Proposition 11.11 Let F be a field, let $(K, \bullet)$ be an associative and commutative
unital F-algebra. If n is a positive integer, and $A = [a_{ij}] \in M_{n\times n}(K)$,
then $|A| = \sum_{j=1}^{n} (-1)^{t+j} a_{tj} \bullet |A_{tj}|$ for each $1 \le t \le n$.


Proof In order to simplify our notation, let $\det(y_1, \ldots, y_n)$ denote the determinant
of the matrix the rows of which are $y_1, \ldots, y_n$. We will first prove the theorem for the
case t = 1. That is to say, we must show that |A| equals $\sum_{j=1}^{n} (-1)^{1+j} a_{1j} \bullet |A_{1j}|$.
For each $1 \le h \le n$, let $v_h \in M_{1\times n}(K)$ be the matrix $[d_1 \ \ldots \ d_n]$ defined by

\[
d_i =
\begin{cases}
1 & \text{if } i = h, \\
0 & \text{otherwise.}
\end{cases}
\]

Then the ith row of A can be written as $w_i = \sum_{j=1}^{n} a_{ij}v_j$ and so

\[
|A| = \det(w_1, \ldots, w_n) = \det\Bigl(\sum_{j=1}^{n} a_{1j}v_j, w_2, \ldots, w_n\Bigr)
= \sum_{j=1}^{n} a_{1j} \bullet \det(v_j, w_2, \ldots, w_n).
\]

Thus we will prove the desired result if we can show that

\[
\det(v_j, w_2, \ldots, w_n) = (-1)^{1+j}|A_{1j}|
\]

for each $1 \le j \le n$. Denote the matrix the rows of which are $v_j, w_2, \ldots, w_n$ by
$B = [b_{ih}]$, where

\[
b_{ih} =
\begin{cases}
1 & \text{if } i = 1 \text{ and } h = j, \\
0 & \text{if } i = 1 \text{ and } h \ne j, \\
a_{ih} & \text{if } i > 1.
\end{cases}
\]

For $1 \le j \le n$, set $G_{1j} = \{\pi \in S_n \mid \pi(1) = j\}$.
Suppose that j = 1. Then, in particular, there is a bijective correspondence between
$G_{11}$ and the set of all permutations of $\{2, \ldots, n\}$ which does not affect the
signum of the permutation since if $\pi \in G_{11}$ then 1 does not appear in any inversion
of $\pi$. Since $b_{11} = 1$ and $b_{1h} = 0$ if h > 1, we thus have

\[
\begin{aligned}
|B| &= \sum_{\pi \in S_n} \operatorname{sgn}(\pi)\, b_{1,\pi(1)} \bullet \cdots \bullet b_{n,\pi(n)} \\
&= \sum_{\pi \in G_{11}} \operatorname{sgn}(\pi)\, b_{1,\pi(1)} \bullet \cdots \bullet b_{n,\pi(n)} \\
&= \sum_{\pi \in G_{11}} \operatorname{sgn}(\pi)\, b_{2,\pi(2)} \bullet \cdots \bullet b_{n,\pi(n)} = |A_{11}|,
\end{aligned}
\]

and so we have shown, as desired, that $|B| = (-1)^{1+1}|A_{11}|$. If j > 1 put column j
of B in the position of the first column and shift columns 1 to j − 1 of B to the right
by one column position. This involves j − 1 column interchanges, and we have

\[
\det(v_j, w_2, \ldots, w_n) = (-1)^{j-1}
\begin{vmatrix}
1 & 0 & \cdots & 0 \\
a_{2j} & a_{21} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{nj} & a_{n1} & \cdots & a_{nn}
\end{vmatrix}
= (-1)^{j+1}|A_{1j}|.
\]

Now assume that t > 1. Again, we can interchange the tth row with the first row
by t − 1 exchanges with the row above, and we get $|A| = (-1)^{t-1}|C|$, where C is a
matrix satisfying $c_{1j} = a_{tj}$ and $|C_{1j}| = |A_{tj}|$ for each $1 \le j \le n$. Therefore,

\[
|A| = (-1)^{t-1}|C| = (-1)^{t-1}\sum_{j=1}^{n} (-1)^{1+j} c_{1j} \bullet |C_{1j}| = \sum_{j=1}^{n} (-1)^{t+j} a_{tj} \bullet |A_{tj}|
\]

as desired.
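Example For $A = \begin{pmatrix} 1 & 7 & 3 & 0 \\ 4 & 0 & 1 & 3 \\ 0 & 2 & 4 & 0 \\ 3 & 1 & 5 & 1 \end{pmatrix}$ we see that

\[
\begin{aligned}
|A| &= 1\begin{vmatrix} 0 & 1 & 3 \\ 2 & 4 & 0 \\ 1 & 5 & 1 \end{vmatrix}
- 7\begin{vmatrix} 4 & 1 & 3 \\ 0 & 4 & 0 \\ 3 & 5 & 1 \end{vmatrix}
+ 3\begin{vmatrix} 4 & 0 & 3 \\ 0 & 2 & 0 \\ 3 & 1 & 1 \end{vmatrix}
- 0\begin{vmatrix} 4 & 0 & 1 \\ 0 & 2 & 4 \\ 3 & 1 & 5 \end{vmatrix} \\
&= 16 + 140 - 30 + 0 = 126
\end{aligned}
\]

and

\[
\begin{aligned}
|A| &= 0\begin{vmatrix} 7 & 3 & 0 \\ 0 & 1 & 3 \\ 1 & 5 & 1 \end{vmatrix}
- 2\begin{vmatrix} 1 & 3 & 0 \\ 4 & 1 & 3 \\ 3 & 5 & 1 \end{vmatrix}
+ 4\begin{vmatrix} 1 & 7 & 0 \\ 4 & 0 & 3 \\ 3 & 1 & 1 \end{vmatrix}
- 0\begin{vmatrix} 1 & 7 & 3 \\ 4 & 0 & 1 \\ 3 & 1 & 5 \end{vmatrix} \\
&= 0 - 2 + 128 - 0 = 126.
\end{aligned}
\]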
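Expansion by minors along a chosen row is naturally recursive. The Python sketch below is an illustration, not an efficient algorithm; the names `minor_matrix` and `cofactor_det` are chosen here, and rows are indexed from 0 in the code, which leaves the sign $(-1)^{t+j}$ unchanged because both indices shift by one.

```python
def minor_matrix(a, i, j):
    """Matrix obtained from a by erasing row i and column j (0-indexed)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]

def cofactor_det(a, t=0):
    """Determinant by expansion along row t (0-indexed).

    The smaller determinants are themselves expanded along their first row.
    """
    n = len(a)
    if n == 1:
        return a[0][0]
    return sum((-1) ** (t + j) * a[t][j] * cofactor_det(minor_matrix(a, t, j))
               for j in range(n))

A = [[1, 7, 3, 0],
     [4, 0, 1, 3],
     [0, 2, 4, 0],
     [3, 1, 5, 1]]
print([cofactor_det(A, t) for t in range(4)])
```

Expanding along any of the four rows gives the same value, 126, matching the two hand computations in the example (rows 1 and 3 in the text's numbering).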


Even this method of computing determinants is not easy, however, unless there is 
a row (or column) of the matrix a significant number of the entries in which are equal 
to 0. To see the computational overhead of computing the determinant of a general 
n x n matrix using minors, let us denote the number of arithmetic operations needed 
to do so by pn. Clearly pı = 1 and p2 = 3. Suppose that we have already found 
Pn—1. Then, by Proposition 11.11, we see that in order to compute the determinant of 
an n x n matrix we have to compute the determinants of n matrices of size (n — 1) x 
(n — 1) and then perform n multiplications and n — 1 additions/subtractions. That is 
to say, we obtain the recursive formula 


Pn =Npn-1 +n + (n—1)=npy-1+2n—1, 
when $n > 2$. Setting $t_n = \frac{1}{n!}p_n$, we see that

\[
t_n - t_{n-1} = \frac{2n-1}{n!} = \frac{2}{(n-1)!} - \frac{1}{n!}
\]

and so

\[
t_n = [t_n - t_{n-1}] + [t_{n-1} - t_{n-2}] + \cdots + [t_3 - t_2] + t_2
= \frac{3}{2} + \sum_{h=3}^{n}\left[\frac{2}{(h-1)!} - \frac{1}{h!}\right],
\]

and thus we see that $p_n = n!\left[\frac{1}{n!} + \frac{1}{(n-1)!} + \cdots + \frac{1}{1!} + 1\right] - 2$. But from calculus we know that $e$, the base of the natural logarithms, has an expansion of the form

\[
e = 1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{n!} + \frac{e^c}{(n+1)!},
\]

where $0 < c < 1$, and so $p_n = n!\left[e - \frac{e^c}{(n+1)!}\right] - 2 = en! - \frac{e^c}{n+1} - 2$. If $n \ge 2$, we see that

\[
0 < \frac{e^c}{n+1} < \frac{e}{n+1} \le \frac{e}{3} < 1
\]

and so we conclude that $en! - 3 < p_n < en! - 2$. Since $p_n$ is a positive integer, we see that $p_n = \lfloor en! \rfloor - 2$, where $\lfloor r \rfloor$ denotes the largest whole number less than or equal to $r$, for any real number $r$. In particular, we see that $p_n$ grows even faster than exponentially, as a function of $n$, which is very rapid growth indeed. For example, $p_{10} = 9{,}864{,}099$ and $p_{15} = 3{,}554{,}627{,}472{,}074$.
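The recursion and the closed form can be compared numerically. The sketch below assumes the recursion is started from $p_2 = 3$ and uses 50-digit decimal arithmetic for $e$; the function names are illustrative only.

```python
from decimal import Decimal, getcontext
from math import factorial


def ops_by_minors(n):
    """p_n from the recursion p_n = n*p_(n-1) + 2n - 1 for n > 2, with p_2 = 3."""
    if n < 2:
        raise ValueError("the recursion is started at n = 2")
    p = 3
    for k in range(3, n + 1):
        p = k * p + 2 * k - 1
    return p


def closed_form(n):
    """floor(e * n!) - 2, with e computed to 50 significant digits."""
    getcontext().prec = 50
    # int() truncates toward zero, which is the floor for a positive value.
    return int(Decimal(1).exp() * factorial(n)) - 2


for n in (2, 3, 5, 10, 15):
    print(n, ops_by_minors(n), closed_form(n))
```

The two columns agree for every $n$ shown, and already for $n = 15$ the count runs to thirteen digits, which illustrates why expansion by minors is not used for general matrices of even moderate size.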

Recently, sophisticated numerical techniques have been developed to compute 
the determinants of matrices with entries from a finite field. 

In special cases, it is also possible to find bounds on the value of the determinant of a matrix, without actually computing it. For example, we will see below that if $A \in M_{n \times n}(\mathbb{R})$ and if $g$ is a positive real number greater than or equal to the absolute value of each of the entries of $A$, then the absolute value of $|A|$ is at most $g^n\sqrt{n^n}$. In 1980, American mathematicians Charles R. Johnson and Morris Newman proved a surprising bound. Let $A = [a_{ij}] \in M_{n \times n}(\mathbb{R})$. For each $1 \le i \le n$, let $b_i$ be the sum of all positive entries in the $i$th row of $A$ and let $c_i$ be the sum of all negative entries in the $i$th row of $A$ (the sum of an empty list is taken to be 0). The absolute value of $|A|$ is then at most $\prod_{i=1}^{n} \max\{b_i, -c_i\} - \prod_{i=1}^{n} \min\{b_i, -c_i\}$.
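Both bounds are easy to evaluate on small examples. The following sketch computes them together with the determinant itself for two integer matrices; the function names are illustrative, and the determinant routine is a plain expansion by minors, chosen for brevity rather than efficiency.

```python
from math import prod, sqrt


def det(m):
    """Determinant by expansion along the first row (fine for small matrices)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([r[:j] + r[j + 1:] for r in m[1:]])
               for j in range(len(m)))


def entry_bound(m):
    """g^n * sqrt(n^n), where g is the largest absolute value of an entry."""
    n = len(m)
    g = max(abs(x) for row in m for x in row)
    return g ** n * sqrt(n ** n)


def johnson_newman_bound(m):
    """prod max{b_i, -c_i} - prod min{b_i, -c_i}, with b_i, c_i the row sums
    of the positive and of the negative entries, respectively."""
    b = [sum(x for x in row if x > 0) for row in m]
    c = [sum(x for x in row if x < 0) for row in m]
    return (prod(max(bi, -ci) for bi, ci in zip(b, c))
            - prod(min(bi, -ci) for bi, ci in zip(b, c)))


A = [[1, 7, 3, 0], [4, 0, 1, 3], [0, 2, 4, 0], [3, 1, 5, 1]]
B = [[1, -1, 1], [1, 2, 0], [1, 0, -1]]
for m in (A, B):
    print(abs(det(m)), johnson_newman_bound(m), entry_bound(m))
```

For the second matrix the Johnson and Newman bound is quite tight, while the entrywise bound is off by nearly an order of magnitude.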
Proposition 11.12 Let $n$ be a positive integer, let $F$ be a field, and let $(K, \bullet)$ be an associative and commutative unital $F$-algebra. Let $A = [a_{ij}] \in M_{n \times n}(K)$ be a matrix which can be represented in block form as

\[
\begin{bmatrix}
B_{11} & O & \cdots & O \\
B_{21} & B_{22} & \cdots & O \\
\vdots & \vdots & \ddots & \vdots \\
B_{m1} & B_{m2} & \cdots & B_{mm}
\end{bmatrix}
\]

where $m > 1$ and each of the $B_{hh}$ is square. Then $|A| = \prod_{h=1}^{m} |B_{hh}|$.


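Proof Let us first consider the case $m = 2$, and assume that $B_{11} \in M_{t \times t}(K)$ for some $t < n$. We will proceed by induction on $t$. If $t = 1$, then, by Proposition 11.11, $|A| = a_{11} \bullet |B_{22}| = |B_{11}| \bullet |B_{22}|$, and we are done. Now assume that $t > 1$ and that the result has been established for all matrices of the form $\begin{bmatrix} B_{11} & O \\ B_{21} & B_{22} \end{bmatrix}$ where $B_{11} \in M_{(t-1) \times (t-1)}(K)$. Let $C_j$ be the matrix obtained from $B_{21}$ by deleting the $j$th column. Then, by Proposition 11.11 and the induction hypothesis,

\[
\begin{aligned}
|A| &= \sum_{j=1}^{t} (-1)^{1+j} a_{1j} \bullet \begin{vmatrix} (B_{11})_{1j} & O \\ C_j & B_{22} \end{vmatrix} \\
&= \sum_{j=1}^{t} (-1)^{1+j} a_{1j} \bullet \bigl(|(B_{11})_{1j}| \bullet |B_{22}|\bigr) \\
&= \Bigl(\sum_{j=1}^{t} (-1)^{1+j} a_{1j} \bullet |(B_{11})_{1j}|\Bigr) \bullet |B_{22}| = |B_{11}| \bullet |B_{22}|,
\end{aligned}
\]

which establishes this case.
Now assume, inductively, that the result has been established for $m$ and consider a matrix $A \in M_{n \times n}(K)$ which can be written in block form as

\[
\begin{bmatrix}
B_{11} & O & \cdots & O \\
B_{21} & B_{22} & \cdots & O \\
\vdots & \vdots & \ddots & \vdots \\
B_{m+1,1} & B_{m+1,2} & \cdots & B_{m+1,m+1}
\end{bmatrix}.
\]

If we set $C = \begin{bmatrix} B_{11} & O & \cdots & O \\ B_{21} & B_{22} & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ B_{m1} & B_{m2} & \cdots & B_{mm} \end{bmatrix}$ then, by the case $m = 2$ and the induction hypothesis, $|A| = |C| \bullet |B_{m+1,m+1}| = \prod_{h=1}^{m+1} |B_{hh}|$.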
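As a small illustration of Proposition 11.12, consider the following matrix over $\mathbb{R}$ with $m = 2$ and both diagonal blocks of size $2 \times 2$. The matrix is an arbitrary choice made for this illustration and does not come from the text.

```latex
\[
A = \begin{bmatrix} 1 & 2 & 0 & 0 \\ 3 & 4 & 0 & 0 \\ 5 & 6 & 7 & 8 \\ 9 & 1 & 2 & 3 \end{bmatrix},
\qquad
B_{11} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix},
\qquad
B_{22} = \begin{bmatrix} 7 & 8 \\ 2 & 3 \end{bmatrix},
\]
\[
|A| = |B_{11}| \cdot |B_{22}| = (4 - 6)(21 - 16) = -10.
\]
```

Expanding $|A|$ directly by minors along the first row gives $1 \cdot 20 - 2 \cdot 15 = -10$, in agreement with the block computation; the entries of $B_{21}$ play no role.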
Note that, as an immediate consequence of Proposition 11.4, we see that if all of the matrices $B_{hh}$ are of the same size, then $|A| = |B_{11} \cdots B_{mm}|$.

Let $A = [a_{ij}] \in M_{n \times n}(K)$ for some associative and commutative unital $F$-algebra $(K, \bullet)$. We define the adjoint of $A$ to be the matrix $\operatorname{adj}(A) = [b_{ij}] \in M_{n \times n}(K)$, where $b_{ij} = (-1)^{i+j}|A_{ji}|$ for all $1 \le i, j \le n$.


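Example If $A = \begin{bmatrix} 1 & 0 & 3 & 5 \\ -3 & 1 & 3 & 1 \\ 4 & 2 & 1 & 2 \\ 1 & 1 & 2 & 5 \end{bmatrix} \in M_{4 \times 4}(\mathbb{R})$ then

\[
\operatorname{adj}(A) = \begin{bmatrix} -20 & 9 & -17 & 25 \\ 50 & -18 & -16 & -40 \\ -40 & -18 & -16 & 50 \\ 10 & 9 & 13 & -35 \end{bmatrix}.
\]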

Proposition 11.13 Let $F$ be a field and let $n$ be a positive integer. If $A = [a_{ij}] \in M_{n \times n}(F)$ then $A[\operatorname{adj}(A)] = |A|I$. In particular, if the matrix $A$ is nonsingular then $A^{-1} = |A|^{-1}\operatorname{adj}(A)$.


Proof Suppose that $\operatorname{adj}(A) = [b_{ij}]$. Then $A[\operatorname{adj}(A)] = [c_{ij}]$, where $c_{ij} = \sum_{k=1}^{n} a_{ik}b_{kj} = \sum_{k=1}^{n} (-1)^{k+j} a_{ik}|A_{jk}|$. If $i = j$, then, by Proposition 11.11, this is just $|A|$. If $i \ne j$, this is just $|A'|$, where $A'$ is a matrix identical to $A$ in all of its rows except the $j$th row, which is equal to the $i$th row of $A$. Thus the matrix $A'$ has two identical rows, and so by Proposition 11.3, $|A'|$ is equal to 0. Hence $A[\operatorname{adj}(A)] = |A|I$, from which we also immediately deduce the second statement since if $A$ is nonsingular then $|A| \ne 0$.
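Proposition 11.13 can be checked by direct computation on a small integer matrix. The sketch below builds the adjoint from signed minors and then forms the inverse with exact rational arithmetic; the helper names are illustrative, and the matrix is simply a convenient $4 \times 4$ test case.

```python
from fractions import Fraction


def minor(m, i, j):
    """Delete row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]


def det(m):
    """Determinant by expansion along the first row."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))


def adj(m):
    """Adjoint: the (i, j) entry is (-1)^(i+j) |A_ji| (note the transposition)."""
    n = len(m)
    return [[(-1) ** (i + j) * det(minor(m, j, i)) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


A = [[1, 0, 3, 5], [-3, 1, 3, 1], [4, 2, 1, 2], [1, 1, 2, 5]]
d = det(A)
inverse = [[Fraction(x, d) for x in row] for row in adj(A)]
print(d, matmul(A, adj(A)), matmul(A, inverse))
```

The product `matmul(A, adj(A))` is $|A|$ times the identity matrix, and dividing the adjoint by $|A|$ gives the inverse. This route to the inverse is mainly of theoretical interest, since it requires $n^2$ determinants of size $(n-1) \times (n-1)$.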


In particular, we note that if A is nonsingular then so is adj(A). 


Proposition 11.14 Let $F$ be a field, let $(K, \bullet)$ be an associative and commutative unital $F$-algebra, and let $n$ be a positive integer. If $A = [a_{ij}] \in M_{n \times n}(K)$ is an upper-triangular matrix then $|A| = \prod_{i=1}^{n} a_{ii}$.


Proof We can prove this by induction on $n$. For the case $n = 1$, it is immediate. Assume therefore that $n > 1$ and that we have already established it for all matrices in $M_{(n-1) \times (n-1)}(K)$. Then, expanding along the first column by Proposition 11.11, $|A| = \sum_{i=1}^{n} (-1)^{i+1} a_{i1} \bullet |A_{i1}| = a_{11} \bullet |A_{11}|$, since $a_{i1} = 0$ for all $i > 1$. But, by the induction hypothesis, $|A_{11}| = \prod_{i=2}^{n} a_{ii}$, and we are done.
By Proposition 11.14, we see that in general, from a computational point of view, it is much faster to first perform elementary operations on a matrix to reduce it to upper-triangular form, and then calculate the determinant (making use of the fact that, from the definition of a determinant function and from Proposition 11.4, we easily know the determinants of the elementary matrices), than to calculate the determinant directly. When working in associative and commutative unital algebras over a field, or when working with matrices of integers, this presents somewhat of a problem since it is not always possible to divide by nonzero scalars in such contexts. However, several variants of Gaussian elimination which do not involve division have been developed to overcome this.
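For matrices of integers, one well-known scheme of this kind is the fraction-free elimination usually attributed to Bareiss. The sketch below is an illustrative implementation written for this discussion, not code taken from any source. Strictly speaking it does divide, but every division is by the previous pivot and is known to be exact, so no fractions ever appear.

```python
def bareiss_det(m):
    """Determinant of a square integer matrix by fraction-free elimination."""
    a = [row[:] for row in m]          # work on a copy
    n = len(a)
    sign, prev = 1, 1                  # prev is the previous pivot
    for k in range(n - 1):
        if a[k][k] == 0:
            # Look for a row below with a nonzero entry in column k.
            for r in range(k + 1, n):
                if a[r][k] != 0:
                    a[k], a[r] = a[r], a[k]
                    sign = -sign       # a row swap changes the sign
                    break
            else:
                return 0               # the whole column is zero
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                # The division by the previous pivot is exact.
                a[i][j] = (a[i][j] * a[k][k] - a[i][k] * a[k][j]) // prev
        prev = a[k][k]
    return sign * a[n - 1][n - 1]


A = [[1, 7, 3, 0], [4, 0, 1, 3], [0, 2, 4, 0], [3, 1, 5, 1]]
print(bareiss_det(A))
```

The number of arithmetic operations here grows as $n^3$, and all intermediate values are themselves determinants of submatrices of the original matrix, which keeps their size under control.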




One of the major researchers instrumental in the development of such 
methods was the twentieth-century Swiss/American computer scien- 
tist Erwin Bareiss. 


Combining Propositions 11.4 and 11.14, we see that if $A \in M_{n \times n}(F)$ can be written in the form $LU$, where $L$ is a lower-triangular matrix and $U$ is an upper-triangular matrix, then $|A|$ is the product of the diagonal elements of $L$ and the diagonal elements of $U$.
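A minimal worked instance of this remark, with a $2 \times 2$ matrix chosen purely for illustration:

```latex
\[
A = \begin{bmatrix} 2 & 1 \\ 4 & 5 \end{bmatrix}
  = \begin{bmatrix} 1 & 0 \\ 2 & 1 \end{bmatrix}
    \begin{bmatrix} 2 & 1 \\ 0 & 3 \end{bmatrix} = LU,
\qquad
|A| = (1 \cdot 1)(2 \cdot 3) = 6 = 2 \cdot 5 - 1 \cdot 4.
\]
```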


Proposition 11.15 (Cramer’s Theorem) Let $F$ be a field and let $n$ be a positive integer. If $A = [a_{ij}] \in M_{n \times n}(F)$ is a nonsingular matrix and if $w = \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix} \in F^n$, then the system of linear equations $AX = w$ has the unique solution $v = \begin{bmatrix} d_1 \\ \vdots \\ d_n \end{bmatrix}$ in which, for each $1 \le i \le n$, we have $d_i = |A|^{-1}|A_{(i)}|$, where $A_{(i)}$ is the matrix formed from $A$ by replacing the $i$th column of $A$ by $w$.


Proof If $Av = w$ then $|A|v = (|A|A^{-1})Av = \operatorname{adj}(A)Av = \operatorname{adj}(A)w$ and so for each $1 \le i \le n$, we have $|A|d_i = \sum_{j=1}^{n} (-1)^{i+j} b_j |A_{ji}|$. But the expression on the right-hand side of this equation is just, by Proposition 11.11, $|A_{(i)}|$, developed by minors on the $i$th column.
Gabriel Cramer was an eighteenth-century Swiss mathematician and friend of Johann Bernoulli (one of the formulators of calculus) who was among the first to study determinants and their use in solving systems of linear equations. Cramer’s rule was also described independently by the eighteenth-century Scottish mathematician Colin Maclaurin.


Cramer’s theorem, published in 1750, was the first systematic method for solving a system of linear equations, though special cases of it were known to Leibnitz 75 years earlier. While it is elegant mathematically, it is clearly not computationally feasible, even when $n$ is only moderately large, as was immediately realized by mathematicians of the time. Indeed, solving a system of linear equations $AX = w$ by Cramer’s method, where $A$ is a nonsingular $n \times n$ matrix over a field $F$, requires $\frac{1}{3}n^4 - \frac{1}{6}n^3 - \frac{1}{3}n^2 + \frac{1}{6}n$ additions and $\frac{1}{3}n^4 + \frac{1}{3}n^3 + \frac{2}{3}n^2 + \frac{2}{3}n - 1$ multiplications, which is considerably worse than the methods we have previously studied, for which the number of arithmetic operations necessary grows as $n^3$, rather than as $n^4$.


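Example Consider the system of linear equations $AX = \begin{bmatrix} 2 \\ 1 \\ 4 \end{bmatrix}$, where $A = \begin{bmatrix} 1 & -1 & 1 \\ 1 & 2 & 0 \\ 1 & 0 & -1 \end{bmatrix}$. Then $|A| = -5$ and

\[
|A_{(1)}| = \begin{vmatrix} 2 & -1 & 1 \\ 1 & 2 & 0 \\ 4 & 0 & -1 \end{vmatrix} = -13, \quad
|A_{(2)}| = \begin{vmatrix} 1 & 2 & 1 \\ 1 & 1 & 0 \\ 1 & 4 & -1 \end{vmatrix} = 4, \quad\text{and}\quad
|A_{(3)}| = \begin{vmatrix} 1 & -1 & 2 \\ 1 & 2 & 1 \\ 1 & 0 & 4 \end{vmatrix} = 7.
\]

As a consequence, we see that the unique solution to the equation is $\frac{1}{5}\begin{bmatrix} 13 \\ -4 \\ -7 \end{bmatrix}$.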
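The computation in this example follows Proposition 11.15 step by step, and can be sketched in a few lines of Python using exact rational arithmetic. The function names are illustrative; the determinant is computed by expansion along the first row, which is adequate only for small systems.

```python
from fractions import Fraction


def det(m):
    """Determinant by expansion along the first row."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([r[:j] + r[j + 1:] for r in m[1:]])
               for j in range(len(m)))


def cramer_solve(a, w):
    """Solve a X = w for a nonsingular square matrix a by Cramer's rule."""
    d = det(a)
    if d == 0:
        raise ValueError("the matrix is singular")
    solution = []
    for i in range(len(a)):
        # A_(i): replace the i-th column of a by w.
        a_i = [row[:i] + [w[r]] + row[i + 1:] for r, row in enumerate(a)]
        solution.append(Fraction(det(a_i), d))
    return solution


A = [[1, -1, 1], [1, 2, 0], [1, 0, -1]]
w = [2, 1, 4]
print(cramer_solve(A, w))
```

Substituting the result back into the three equations confirms that it solves the system.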


We note that if A = [aij] € Mnxn(F) then the polynomial 


> sgn(w) Xx(1)X2(2) °° Xam) € FLX, ..-, Xn] 


TESy 


is flat and of degree n. This allows us to make an interesting use of Proposition 4.5. 
Proposition 11.16 Let $F$ be a field of characteristic other than 2, let $A = [a_{ij}] \in M_{n \times n}(F)$ be an arbitrary matrix, and let $C = [c_{ij}] \in M_{n \times n}(F)$ be a diagonal matrix with nonzero entries on the diagonal. Then there exists a diagonal matrix $E = [e_{ij}] \in M_{n \times n}(F)$ with diagonal entries $\pm 1$ such that $EC + A$ is nonsingular.


Proof Let $X_1, \ldots, X_n$ be indeterminates over $F$ and let $D = [d_{ij}] \in M_{n \times n}(F[X_1, \ldots, X_n])$ be the diagonal matrix with $d_{ii} = X_i$ for $1 \le i \le n$. Then $|DC + A|$ is a flat polynomial in $F[X_1, \ldots, X_n]$ of degree $n$, and so the result follows immediately from Proposition 4.5.


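Example If $A = [a_{ij}] \in M_{n \times n}(\mathbb{R})$ and if $\varepsilon > 0$ then, by Proposition 11.16, it is possible to “tweak” the diagonal of $A$ to obtain a nonsingular matrix $[a'_{ij}]$ where

\[
a'_{ij} = \begin{cases} a_{ij} \pm \varepsilon & \text{if } i = j, \\ a_{ij} & \text{otherwise.} \end{cases}
\]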
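Proposition 11.16 only asserts that a suitable choice of signs exists. For small $n$ one can simply try all $2^n$ sign patterns, as in the following sketch; the function name and the test matrix are illustrative assumptions, and the brute-force search is not meant as a practical method for large $n$.

```python
from itertools import product


def det(m):
    """Determinant by expansion along the first row."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([r[:j] + r[j + 1:] for r in m[1:]])
               for j in range(len(m)))


def nonsingular_tweak(a, eps):
    """Return (signs, tweaked matrix) for the first sign pattern making
    a + diag(signs) * eps nonsingular, or None if no pattern works."""
    n = len(a)
    for signs in product((1, -1), repeat=n):
        tweaked = [[a[i][j] + (signs[i] * eps if i == j else 0)
                    for j in range(n)] for i in range(n)]
        if det(tweaked) != 0:
            return signs, tweaked
    return None


# Adding +1 to the first diagonal entry gives a zero row, so the search
# has to move on to a pattern beginning with -1.
A = [[-1, 0], [0, 5]]
print(nonsingular_tweak(A, 1))
```

Over $\mathbb{R}$ the proposition guarantees that the function never returns `None`.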


The sum which appears in the definition of the determinant shows up in other contexts related to matrix algebras. An associative algebra $(K, \bullet)$ over a field $F$ satisfies the standard identity of degree $n$ if and only if $\sum_{\pi \in S_n} \operatorname{sgn}(\pi)\, a_{\pi(1)} \bullet a_{\pi(2)} \bullet \cdots \bullet a_{\pi(n)} = 0$ for any list $a_1, \ldots, a_n$ of elements of $K$. Thus, for example, the standard identity of degree 2 is $a_1 \bullet a_2 - a_2 \bullet a_1 = 0$. The algebra $K$ satisfies this identity precisely when it is commutative. The Amitsur–Levitzki Theorem states that for any field $F$ and any positive integer $n$, the $F$-algebra $M_{n \times n}(F)$ satisfies the standard identity of degree $k$ for each $k \ge 2n$. There are several proofs of this result, all beyond the scope of this book. Some of these are based on a generalization of the Cayley–Hamilton Theorem, which we shall see in the following chapter.
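The smallest nontrivial case of the Amitsur–Levitzki Theorem, namely that $2 \times 2$ matrices satisfy the standard identity of degree 4, can be checked by brute force. The sketch below evaluates the standard polynomial on lists of integer matrices; the function names and the sample matrices are illustrative choices.

```python
from itertools import permutations


def sgn(perm):
    """Sign of a permutation given as a tuple, via its number of inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def standard_polynomial(mats):
    """Sum over all permutations pi of sgn(pi) * a_pi(1) * ... * a_pi(k)."""
    n = len(mats[0])
    total = [[0] * n for _ in range(n)]
    for perm in permutations(range(len(mats))):
        term = [[int(i == j) for j in range(n)] for i in range(n)]
        for index in perm:
            term = matmul(term, mats[index])
        s = sgn(perm)
        total = [[t + s * x for t, x in zip(trow, xrow)]
                 for trow, xrow in zip(total, term)]
    return total


four = [[[1, 2], [3, 4]], [[0, 1], [1, 0]], [[2, 0], [1, 3]], [[1, 1], [0, 1]]]
two = [[[0, 1], [0, 0]], [[0, 0], [1, 0]]]
print(standard_polynomial(four))   # degree 4 on 2 x 2 matrices
print(standard_polynomial(two))    # degree 2: the commutator
```

The first result is the zero matrix, as the theorem predicts, while the second is nonzero because the two matrices do not commute.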



Yaakov Levitzki and his student Shimshon 
Amitsur were twentieth-century Israeli alge- 
braists. 


We end this chapter by showing how an important construction in analysis can be considered in terms of determinants of matrices over the $\mathbb{R}$-algebra $\mathbb{R}[X]$. Let $c_0, c_1, \ldots$ be real numbers and let us consider the analytic function $f : x \mapsto \sum_{i=0}^{\infty} c_i x^i$, which converges for all $x$ in some subset $U$ of $\mathbb{R}$. We know that $U \ne \emptyset$ since surely $0 \in U$. Given positive integers $k$ and $n$, we want to find polynomials $p(X), q(X) \in \mathbb{R}[X]$ of degrees at most $k$ and $n$, respectively, such that the function $x \mapsto p(x)q(x)^{-1} - f(x)$ also converges for all $x \in U$ and is representable there by a power series of the form $x \mapsto \sum_{i=1}^{\infty} d_i x^{k+n+i}$. If we find such $p$ and $q$, then the function $x \mapsto p(x)q(x)^{-1}$ is called the Padé approximant to $f$ of type $k/n$. Padé approximants are very important tools in differential equations and in approximation theory. Hermite made use of Padé approximants in his proof of the transcendence of $e$.


Henri Padé was a nineteenth-century French en- 
gineer who developed these approximants in the 
course of his work. Interest in them intensified 
in the early twentieth century when the French 
mathematician Emile Borel made extensive use 
of them in his work on analysis. 


Example If $f : x \mapsto e^x = \sum_{i=0}^{\infty} \frac{1}{i!}x^i$ then the function $g_1 : x \mapsto \dfrac{x^2 + 4x + 6}{6 - 2x}$ is a Padé approximant to $f$ of type 2/1 and the function $g_2 : x \mapsto \dfrac{x^2 + 6x + 12}{x^2 - 6x + 12}$ is a Padé approximant to $f$ of type 2/2.


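If we are given an analytic $f$ as above, how do we calculate Padé approximants to it? One way is by using determinants. First of all, define $c_{-i} = 0$ for all positive integers $i$. Then, given positive integers $k$ and $n$, define the matrices $P_{k/n}(X), Q_{k/n}(X) \in M_{(n+1) \times (n+1)}(\mathbb{R}[X])$ by setting:

\[
P_{k/n}(X) = \begin{bmatrix}
c_{k-n+1} & c_{k-n+2} & \cdots & c_{k+1} \\
c_{k-n+2} & c_{k-n+3} & \cdots & c_{k+2} \\
\vdots & \vdots & & \vdots \\
c_k & c_{k+1} & \cdots & c_{k+n} \\
\sum_{j=0}^{k-n} c_j X^{n+j} & \sum_{j=0}^{k-n+1} c_j X^{n+j-1} & \cdots & \sum_{j=0}^{k} c_j X^j
\end{bmatrix}
\]

and

\[
Q_{k/n}(X) = \begin{bmatrix}
c_{k-n+1} & c_{k-n+2} & \cdots & c_{k+1} \\
c_{k-n+2} & c_{k-n+3} & \cdots & c_{k+2} \\
\vdots & \vdots & & \vdots \\
c_k & c_{k+1} & \cdots & c_{k+n} \\
X^n & X^{n-1} & \cdots & 1
\end{bmatrix}.
\]

Then the polynomials $p(X) = |P_{k/n}(X)|$ and $q(X) = |Q_{k/n}(X)|$ are of the desired size, and our approximant is given by $x \mapsto p(x)q(x)^{-1}$.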
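Since only the last rows of $P_{k/n}(X)$ and $Q_{k/n}(X)$ contain the indeterminate, both determinants can be expanded along that row, with scalar cofactors. The following sketch does this in exact rational arithmetic; the function `pade` and its conventions (coefficients supplied as a function `c`, polynomials returned as coefficient lists with the constant term first) are assumptions made for this illustration.

```python
from fractions import Fraction
from math import factorial


def det(m):
    """Determinant by expansion along the first row."""
    if not m:
        return Fraction(1)
    return sum((-1) ** j * m[0][j] * det([r[:j] + r[j + 1:] for r in m[1:]])
               for j in range(len(m)))


def pade(c, k, n):
    """Coefficient lists (p, q) of |P_k/n(X)| and |Q_k/n(X)|.

    c(i) must return the i-th power-series coefficient for i >= 0.
    """
    def coef(i):
        return c(i) if i >= 0 else Fraction(0)   # c_(-i) = 0 by convention

    # The n scalar rows shared by P and Q: row r holds c_(k-n+r), ..., c_(k+r).
    top = [[coef(k - n + r + s) for s in range(n + 1)] for r in range(1, n + 1)]
    p = [Fraction(0)] * (k + 1)
    q = [Fraction(0)] * (n + 1)
    for s in range(n + 1):
        # Cofactor of the entry in the last row and column s (0-based).
        cofactor = (-1) ** (n + s) * det([row[:s] + row[s + 1:] for row in top])
        q[n - s] += cofactor                      # last-row entry X^(n-s)
        for j in range(k - n + s + 1):            # entry sum_j c_j X^(n-s+j)
            p[n - s + j] += cofactor * coef(j)
    return p, q


def exp_coefficient(i):
    return Fraction(1, factorial(i))


p, q = pade(exp_coefficient, 2, 1)
print(p, q)
```

For the exponential series and type 2/1 this yields $p(X) = \frac{1}{2} + \frac{1}{3}X + \frac{1}{12}X^2$ and $q(X) = \frac{1}{2} - \frac{1}{6}X$; multiplying numerator and denominator by 12 gives $(X^2 + 4X + 6)/(6 - 2X)$.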
Exercise 637


sin(a) cos(a) cos(a) sin(a) 
Calculate | inb) cos(b) sin(b) cos(b) 
Exercise 638 
1 i l+i 
Calculate | —i 1 0 eC 
l—i 1 
Exercise 639 
a—6 0 0 —8 
5 a—4 0 12 
Calculate L] 3 a -6 for any a E R. 
1 
0 = 1 1 
Exercise 640
Find the image of the function $f$ from $\mathbb{R}$ to itself defined by
\[
f : t \mapsto \begin{vmatrix} 1 & 0 & -t \\ 1 & 1 & -1 \\ t & 0 & -1 \end{vmatrix}.
\]
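Exercise 641
For real numbers $a$, $b$, $c$, and $d$, show that
\[
\begin{vmatrix}
a^2 & (a+1)^2 & (a+2)^2 & (a+3)^2 \\
b^2 & (b+1)^2 & (b+2)^2 & (b+3)^2 \\
c^2 & (c+1)^2 & (c+2)^2 & (c+3)^2 \\
d^2 & (d+1)^2 & (d+2)^2 & (d+3)^2
\end{vmatrix} = 0.
\]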


Exercise 642 


11 
for real numbers a and b.


Let be a positive integer and let c be a fixed real number. Calculate the deter- 


minant of the matrix A = [aij] E€ Mnxn(R) defined by 


Exercise 643 


For a, b ER, calculate 


c ifi<j, 

aj= ji ifi=j, 

0 ifi>j. 

a b a+b 
b a+b a 
a+b a b 
Exercise 644
Let $F = GF(2)$. Does there exist a matrix $A \in M_{2 \times 2}(F)$ other than $I$ satisfying the condition that $|A| = |A^T A| = 1$?


Exercise 645
If $n$ is a positive integer, we define the $n$th Hankel matrix $H_n \in M_{n \times n}(\mathbb{R})$ to be the matrix $[a_{ij}]$ satisfying

\[
a_{ij} = \begin{cases} 0 & \text{if } i + j - 1 > n, \\ i + j - 1 & \text{otherwise.} \end{cases}
\]

Calculate $|H_n|$.


The nineteenth-century German mathematician Hermann Hankel 
was among the first to recognize and popularize the work of Grass- 
mann. 


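Exercise 646

Let $p(X) = a_0 + a_1X + a_2X^2$ and $q(X) = b_0 + b_1X + b_2X^2$ be polynomials in $\mathbb{C}[X]$. Show that there exists a complex number $c$ satisfying $p(c) = q(c) = 0$ if and only if

\[
\begin{vmatrix} a_0 & a_1 & a_2 & 0 \\ 0 & a_0 & a_1 & a_2 \\ b_0 & b_1 & b_2 & 0 \\ 0 & b_0 & b_1 & b_2 \end{vmatrix} = 0.
\]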

Exercise 647 
Find the set of all pairs (a, b) of real numbers such that 


a+1 3a b+3a b+1 
2b b+1 2-b 1 

a+2 0 1 a+3 
b-1 1 a+2 a+b 


=0. 


Exercise 648
For $a, b, c \in \mathbb{R}$, show that $\begin{vmatrix} 0 & (a-b)^2 & (a-c)^2 \\ (b-a)^2 & 0 & (b-c)^2 \\ (c-a)^2 & (c-b)^2 & 0 \end{vmatrix} \ge 0$.
Exercise 649
Let $n$ be a positive even integer and let $c, d \in \mathbb{Q}$. Let $A = [a_{ij}] \in M_{n \times n}(\mathbb{Q})$ be the matrix defined by

\[
a_{ij} = \begin{cases} c & \text{if } i = j, \\ d & \text{if } i + j = n + 1, \\ 0 & \text{otherwise.} \end{cases}
\]

Calculate $|A|$.


Exercise 650 
Letn be a positive integer and let A = [aij] E€ Mnxn(Q) be the tridiagonal matrix 
defined by 


sl oifti-gl<l, 
lij =) 0 otherwise. 


Show that 


—1 ifn=3k, 
IAl=41 ifn=3k+1, 
0 ifn=3k+2 


for some nonnegative integer k. 


Exercise 651
Find $a, b, c \in \mathbb{Z}$ for which $\begin{vmatrix} a+b & c & c \\ a & b+c & a \\ b & b & a+c \end{vmatrix}$ is divisible by 8.


Exercise 652

Let $n$ be a positive integer and let $A \in M_{n \times n}(\mathbb{Q})$ be a nonsingular matrix satisfying the condition that all of the entries of $A$ and of $A^{-1}$ are integers. Show that $|A| = \pm 1$.


Exercise 653
For elements $a$, $b$, and $c$ of a field $F$, calculate $\begin{vmatrix} -2a & a+b & a+c \\ a+b & -2b & b+c \\ a+c & b+c & -2c \end{vmatrix}$.


Exercise 654
We know that the integers 23028, 31882, 86469, 6327, and 61902 are all divisible by 19. Show that $\begin{vmatrix} 2 & 3 & 0 & 2 & 8 \\ 3 & 1 & 8 & 8 & 2 \\ 8 & 6 & 4 & 6 & 9 \\ 0 & 6 & 3 & 2 & 7 \\ 6 & 1 & 9 & 0 & 2 \end{vmatrix}$ is also divisible by 19.
Exercise 655
Let $q \in \mathbb{Q}$. Show that there are infinitely-many matrices in $M_{3 \times 3}(\mathbb{Q})$ of the form $\begin{bmatrix} 2 & 2 & 3 \\ 3q+2 & 4q+2 & 5q+3 \\ a & b & c \end{bmatrix}$, where $a < b < c$, the determinant of which equals $q$.


Exercise 656
Let $n$ be a positive integer and let $F$ be a field. Let $A \in M_{n \times n}(F)$ be a nonsingular matrix which can be written in block form $A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$, where $A_{11} \in M_{k \times k}(F)$ for some integer $k < n$. Write $A^{-1}$ as $\begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}$, where $B_{11} \in M_{k \times k}(F)$. Show that $|A_{11}| = |A| \cdot |B_{22}|$.


Exercise 657 


Find all real numbers a for which 


l 
= oO Ne 
NR We 


=1 


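Exercise 658

Let $a_1, a_2, \ldots$ be a sequence of real numbers. For each positive integer $n$, define the $n$th continuant $c_n$ of the sequence to be the determinant of the tridiagonal matrix $A_n = [a_{ij}] \in M_{n \times n}(\mathbb{R})$ given by

\[
a_{ij} = \begin{cases} a_i & \text{if } i = j, \\ -1 & \text{if } i = j - 1, \\ 1 & \text{if } i = j + 1, \\ 0 & \text{otherwise.} \end{cases}
\]

Show that $c_n = a_nc_{n-1} + c_{n-2}$ for all $n > 2$.


Exercise 659
Let $n > 1$ be an integer, let $d$ be a real number, and let $A = [a_{ij}] \in M_{n \times n}(\mathbb{R})$ be the matrix defined as follows:

\[
a_{ij} = \begin{cases} 0 & \text{if } i = j, \\ 1 & \text{if } i > 1 \text{ and } j = 1 \text{ or } i = 1 \text{ and } j > 1, \\ d & \text{otherwise.} \end{cases}
\]

Show that $|A| = (-1)^{n-1}(n-1)d^{n-2}$.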
Exercise 660
Let $b_1, \ldots, b_n$ be nonzero real numbers and let $A = [a_{ij}] \in M_{n \times n}(\mathbb{R})$ be the matrix defined as follows:

\[
a_{ij} = \begin{cases} 1 + b_i & \text{if } i = j, \\ 1 & \text{otherwise.} \end{cases}
\]

Calculate $|A|$.


Exercise 661
Let $A = [a_{ij}] \in M_{4 \times 4}(\mathbb{Q})$ be a matrix each entry of which is either $-2$ or $3$. Show that $|A|$ is an integer multiple of 125.


Exercise 662 

Let a, b, c, and d be real numbers not all of which are equal to 0. Show that the 
a b c d 

matrix a a e € M4x4(R) is nonsingular. 
c —d —a b 4x4 g 

d c —b —a 


Exercise 663 
Does there exist a rational number a satisfying the condition that the matrix 
l a 0 
a 1 1 | € M3x3(Q) is nonsingular? 
-l a -l 


Exercise 664 
Find all matrices J 4 A E€ M2x2(R) satisfying A =]. 


Exercise 665 
Find all triples (a, b, c) of real numbers satisfying the condition 


laa 
1 b Bl=(b—c)(c—a\(a—b)(at+b+to). 
(6 c? 


Exercise 666

Let $n$ be a positive integer, let $A = [a_{ij}] \in M_{n \times n}(\mathbb{C})$ and let $B = [b_{ij}] \in M_{n \times n}(\mathbb{C})$ be defined by $b_{ij} = \overline{a_{ij}}$ for each $1 \le i, j \le n$. Show that $|AB|$ is a nonnegative real number.


Exercise 667 


1 log, a 


Calculate log. b 1 


for given positive real numbers a and b. 


a 
Exercise 668


Let F be a field. Calculate 3 for anya e€ F. 


2a 3a? 4 
4a? 3a? 2a 1l 


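Exercise 669

Calculate $\begin{vmatrix} \cos(a) & \sin(a) & \cos(a) & \sin(a) \\ \cos(2a) & \sin(2a) & 2\cos(2a) & 2\sin(2a) \\ \cos(3a) & \sin(3a) & 3\cos(3a) & 3\sin(3a) \\ \cos(4a) & \sin(4a) & 4\cos(4a) & 4\sin(4a) \end{vmatrix}$ for $a \in \mathbb{R}$.


Exercise 670
Let $n$ be a positive integer and let $A = [a_{ij}] \in M_{n \times n}(\mathbb{R})$ be the matrix defined by

\[
a_{ij} = \begin{cases} 0 & \text{if } i = j, \\ 1 & \text{otherwise.} \end{cases}
\]

Calculate $|A|$.


Exercise 671

Let $A = \begin{bmatrix} 3 & -1 & 1 \\ 0 & 2 & 4 \\ 1 & -1 & 1 \end{bmatrix} \in M_{3 \times 3}(\mathbb{R})$. Calculate $\operatorname{adj}(A)$.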
Exercise 672 
1 0 
Let F = GF(2) andletA=|]0O 1 1 | € M3x3(F). Calculate adj(A). 
1 1 1 


Exercise 673
Let $F$ be a field, let $n$ be a positive integer, and let $A, B \in M_{n \times n}(F)$. Is it necessarily true that $\operatorname{adj}(AB) = \operatorname{adj}(A)\operatorname{adj}(B)$?


Exercise 674
Let $F$ be a field, let $n$ be a positive integer, and let $A \in M_{n \times n}(F)$. Is it necessarily true that $\operatorname{adj}(A^T) = \operatorname{adj}(A)^T$?


Exercise 675
Let $F$ be a field, let $n$ be a positive integer, and let the matrices $A, B \in M_{n \times n}(F)$ be nonsingular. Show that $\operatorname{adj}(B^{-1}AB) = B^{-1}\operatorname{adj}(A)B$.


Exercise 676
Let $F$ be a field, let $n$ be a positive integer, and let $A, B \in M_{n \times n}(F)$ be matrices satisfying $B \ne O$ and $AB = O$. Show that $|A| = 0$.
Exercise 677

Let $A = \begin{bmatrix} 1 & 2 & 3 \\ 1 & 3 & 4 \\ 1 & 4 & 3 \end{bmatrix} \in M_{3 \times 3}(\mathbb{R})$. Use the adjoint of $A$ to calculate $A^{-1}$.


Exercise 678

Let $F$ be a field, let $n$ be a positive integer, and let $A = [a_{ij}] \in M_{n \times n}(F)$. Let $B = [b_{ij}] \in M_{n \times n}(F)$ be defined by $b_{ij} = (-1)^{i+j}a_{ij}$ for all $1 \le i, j \le n$. Show that $|A| = |B|$.


Exercise 679

Let $F$ be a field, let $n$ be a positive integer, and let $A = [a_{ij}] \in M_{n \times n}(F)$. Let $B = [b_{ij}] \in M_{n \times n}(F)$ be defined by $b_{ij} = (-1)^{i+j+1}a_{ij}$ for all $1 \le i, j \le n$. Show that $(-1)^n|A| = |B|$.


Exercise 680
Let $n$ be a positive integer and let $\pi \in S_n$. Let $A \in M_{n \times n}(\mathbb{Q})$ be the permutation matrix defined by $\pi$. Calculate $|A|$.


Exercise 681 
Is the set of all permutation matrices in My xn(Q) closed under multiplication? 
Is the inverse of a permutation matrix a permutation matrix? 


Exercise 682 
Let A = [aij] E€ M3x3(R) be a matrix in which aj2 Æ 0 for all 1 <i < 3. Denote 
the minor of a;; for all 1 <i, j < n by Aij. Show that 


1 


a22 


1 


432 


A2 A23 
A31 A32 


Ai Aj3 
A31 A33 


Ail A33 


1 
A| = — 
pal A2 An 


a12 


Exercise 683
Let $F$ be a field, let $n$ be a positive integer, and let $A = [a_{ij}] \in M_{n \times n}(F)$ be nonsingular. Show that $\operatorname{adj}(\operatorname{adj}(A)) = |A|^{n-2}A$.


Exercise 684 

Let a and b be real numbers and let n be an integer greater than 2. Let D = [dij] € 
Mnxn(R) be the matrix defined by dj; = sin(ia + jb) for all 1 <i, j <n. Show 
that |D| = 0. 


Exercise 685 
Let F be a field and let a, b, c,d, e, f, g € F. Show that 
Exercise 686
Let $F$ be a field, let $n$ be a positive integer, and let $A = [a_{ij}] \in M_{n \times n}(F)$. Let $B = [b_{ij}] \in M_{n \times n}(F)$ be the matrix defined by

\[
b_{ij} = \begin{cases} a_{ij} + a_{i,j+1} & \text{if } j < n, \\ a_{in} & \text{otherwise.} \end{cases}
\]

Show that $|B| = |A|$.


Exercise 687 
Let k and n be integers greater than 1. Let F be a field and let A = [a;;] be a 
matrix in Mx x»(F), the upper row of which contains at least one nonzero entry. 


For each 2 <i < k and each 2 < j < n, let dij = be . Show that the rank 
il ij 
dö «s dn 
of the matrix D = n e € Mck—-1)x(n—1) (F) is r — 1, where r is 
dk2 ... dkn 
the rank of A. 
Exercise 688 


Let F be a field, let a 4 b be elements of F, and let A, B E€ M2x2(F) be matrices 
satisfying the condition that |A + AB| € {a,b} for h = 1,2,3,4, 5. Show that 
|A+9B| € {a,b}. 


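Exercise 689

Let $F$ be a field and let $a, b, c \in F$. Make use of the matrix $\begin{bmatrix} b & c & 0 \\ a & 0 & c \\ 0 & a & b \end{bmatrix}$ in order to calculate the determinant of the matrix $\begin{bmatrix} b^2 + c^2 & ab & ac \\ ab & a^2 + c^2 & bc \\ ac & bc & a^2 + b^2 \end{bmatrix}$.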


Exercise 690 

Let A € Myyn(C) be a nonsingular matrix, which we will write in the form 
B+iC, where B,C € Mnxn(R). Show that there is a real number d such that 
the matrix B + dC € Mnxn(R) is nonsingular. 


Exercise 691 

Let F be a field and let n be a positive integer. Let A € Mnxn(F) be a matrix 
having the property that the sum of all even-numbered columns (considered as 
vectors in F”) of A equals the sum of all odd-numbered columns of A. What 
is |A|? 
Exercise 692

Let $V = \mathbb{R}^2$ and let $f : V^3 \to \mathbb{R}$ be the function defined as follows: if $v_i = \begin{bmatrix} a_i \\ b_i \end{bmatrix}$ for $i = 1, 2, 3$, then $f : \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} \mapsto \begin{vmatrix} a_1 & b_1 & 1 \\ a_2 & b_2 & 1 \\ a_3 & b_3 & 1 \end{vmatrix}$. Show that $f(v_1, v_2, v_3) = f(v_4, v_2, v_3) + f(v_1, v_4, v_3) + f(v_1, v_2, v_4)$ for all $v_1, v_2, v_3, v_4 \in V$.


Exercise 693 

Let F be a field and let n be a positive integer. Let D = [djj] E€ Mnxn(F) be 
the matrix defined by dj; = 1 for all 1 < i, j < n. Show that for any matrix 
A € Mpnxn(F) precisely one of the following conditions holds: (1) There is a 
unique scalar a € F such that A + aD is singular; (2) A + aD is singular for all 
scalars a € F; (3) A +aD is nonsingular for all scalars a € F. 


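Exercise 694
Let $A, B, C, D \in M_{2 \times 2}(\mathbb{R})$ and let $M$ be the matrix $\begin{bmatrix} A & B \\ C & D \end{bmatrix} \in M_{4 \times 4}(\mathbb{R})$. If all of the “formal determinants” $AD - BC$, $AD - CB$, $DA - BC$, and $DA - CB$ are nonsingular, is $M$ necessarily nonsingular?


Exercise 695

Let $A, B, C, D \in M_{2 \times 2}(\mathbb{R})$ and let $M$ be the matrix $\begin{bmatrix} A & B \\ C & D \end{bmatrix} \in M_{4 \times 4}(\mathbb{R})$. If $M$ is a nonsingular matrix, is at least one of the “formal determinants” $AD - BC$, $AD - CB$, $DA - BC$, and $DA - CB$ also nonsingular?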


Exercise 696

Let $n > 1$ be an integer and let $A = [a_{ij}] \in M_{n \times n}(\mathbb{Q})$ be a matrix satisfying the condition that each $a_{ij}$ is either equal to $1$ or to $-1$. Show that $|A|$ is an integer multiple of $2^{n-1}$.


Exercise 697 
If a,b,c, d,e, f are nonzero elements of a field F, show that 


0 e BP e O ad be cf 
a’ 0 f? e| lad O cf be 
2 f 0 | |be cf 0 ad) 
c e d o cf be ad 0 


Exercise 698 

Let n be a positive integer and let c1,...,Cn be distinct real numbers tran- 
scendental over Q. For 1 < h < n, let p(X) = yo aj X' € Q[X] be a 
polynomial of degree h — 1. Let A = [p;(cj)] E€ Mnxn (R). Show that |A| = 
(a0--+@n—1) |]; <j (Ej — ci). 
Exercise 699

Let n be a positive integer and let cj,...,c, be distinct real numbers tran- 
scendental over Q. For each 1 <i, j < n, set dij = cj — gi and let A = 
[dij] E€ Mnxn(R). Show that |A| equals (c1---Cn)” [i<j — oc) — 
cic) [fi — 1). 


Exercise 700 
Let a, b, c, d be elements of a field F. Solve the equation 


No ea 
a no & 
Fa an 
a Fa Re 


Exercise 701
Let $F$ be a field. Does there exist a matrix $A$ in $M_{3 \times 3}(F)$ satisfying the condition that the rank of $\operatorname{adj}(A)$ equals 2?


Exercise 702
Let $a$, $b$, and $c$ be nonzero real numbers. Under which conditions does the equation $\begin{vmatrix} 0 & a - X & b - X \\ -a - X & 0 & c - X \\ -b - X & -c - X & 0 \end{vmatrix} = 0$ have more than one solution?


Exercise 703 
Use determinants to show that there is no matrix A € M4x4(Q) satisfying the 


1 0 0 0 
02 00 

ve 4_ 
condition that A? = 0010 
00 0 1 


Exercise 704
Let $A = [a_{ij}] \in M_{n \times n}(\mathbb{R})$ be a matrix satisfying the condition $|a_{ii}| > \sum_{j \ne i} |a_{ij}|$ for all $1 \le i \le n$. Such matrices are called strictly diagonally dominant. Show that $|A| \ne 0$.


Exercise 705 
Let F be a field and let a, b, c € F. Is it true that 


a bco —a b c 0 
ba 0 c|_| b -a 0 ch 
c 0 ab c 0 —a b| 
0 c ba 0 c b —a 
Exercise 706
Let $F$ be a field and let $n > 2$ be an integer. Give an example of a matrix $A \in M_{n \times n}(F)$ all of the entries in which are nonzero, satisfying $\operatorname{adj}(A) = O$.


Exercise 707
Let $F$ be a field and let $n > 2$ be an integer. Show that $|\operatorname{adj}(A)| = |A|^{n-1}$ for all $A \in M_{n \times n}(F)$.


Exercise 708 
Is the function adj: M2x2 (R) > M2x2(R) epic? 


Exercise 709

For real numbers $s$ and $t$, let $A(s,t) = \begin{bmatrix} s & 0 & t \\ 1 & 1 & 1 \\ t & 0 & 1 \end{bmatrix}$ and let $B(s,t) = \operatorname{adj}(A(s,t))$. Find the set of all real numbers $s$ satisfying the condition that $|A(s,t)| \ne |B(s,t)|$ for all $t \in \mathbb{R}$.


Exercise 710 

Let n be a positive integer and for all 1 < j < n, let mj be a positive inte- 
ger. Define the matrix A = [aij] E€ Mnxn(Q) by setting aij = oe) for all 
1 <i, j <n. Calculate |A|. 


Exercise 711
Let $a$ and $b$ be distinct elements of a field $F$ and let $n$ be a positive integer. Let $A(n) = [a_{ij}] \in M_{n \times n}(F)$ be the matrix defined by

\[
a_{ij} = \begin{cases} a & \text{if } i = j, \\ b & \text{otherwise.} \end{cases}
\]

Use induction on $n$ to prove that $|A(n)| = [a + (n-1)b](a - b)^{n-1}$.


Exercise 712 
Let n be a positive integer and pick integers 1 < h, k <n. Let f, g € RË be the 
functions defined by 


. lEnc| ifc#0, 
ren fi ifc=0 


and g : c > |Ehk;c|. Are these functions continuous? 


Exercise 713 

Let n be a positive integer and let A € Mnxn(Q) be a nonsingular matrix the 
entries of which are integers and the determinant of which is +1. Show that all 
of the entries of A~! are integers. 
Exercise 714
Let $F$ be a field and let $A \in M_{2 \times 2}(F)$. Show that the matrix $A^2 + |A|I$ belongs to the subspace of $M_{2 \times 2}(F)$ generated by $\{A\}$.


Exercise 715

Let $A = [a_{ij}] \in M_{3 \times 3}(\mathbb{Q})$ be a matrix all of the entries of which are nonnegative one-digit integers. Let $d$ be a positive integer dividing the three-digit integers $a_{11}a_{12}a_{13}$, $a_{21}a_{22}a_{23}$, and $a_{31}a_{32}a_{33}$. Show that $d$ divides $|A|$.


Exercise 716 

Let n be a positive integer and let F be a field. Let A E€ My xn(F) be a matrix 
satisfying the condition that |A + B| = |A| for all B E€ Myxn(F). Show that 
A=0O. 


Exercise 717

Let $n$ be an odd positive integer and let $A \in M_{n \times n}(\mathbb{R})$. Show that there exists a diagonal matrix $B$ the diagonal entries of which are $\pm 1$ such that $A + B$ is nonsingular.


Exercise 718 

Let n > 1 be an integer and let F be a field. Show that there exist subspaces W 
and Y of Mnxn(F) satisfying Mnxn(F) = W @ Y such that the restrictions of 
the determinant function ô„ to W and to Y are linear transformations. 


Exercise 719 

Let n > 1 be an integer and let B be the set of all of the nonsingular matrices in 
Mnxn(R) all of the entries of which are either 1 or 0. Show that in every matrix 
in B there are at least n — 1 entries which are equal to O and that there exists a 
matrix in B in which there are precisely n — 1 entries equal to 0. 


Exercise 720 
Let A be a matrix formed by permuting the rows or columns of an Hadamard 
matrix. Is A necessarily an Hadamard matrix? 


Exercise 721

Let $V$ be a vector space of finite dimension $n$ over a field $F$ and let $\{v_1, \ldots, v_n\}$ be a given basis for $V$. Let $U$ be the subset of $V$ consisting of all vectors of the form $v_a = \sum_{i=1}^{n} a^{i-1}v_i$, for $0 \ne a \in F$. Show that any subset of $U$ having $n$ elements is a basis for $V$.


Exercise 722 
Let F be a field and let n be an even positive integer. Let A € Mnxn(F) bea 


matrix which can be written in block form as [A;;], where Aj; = | n q| if 


—c; 0 
0 0 : 
otherwise. Calculate the Pfaffian of A. 


j= jand Ay =[ 5 0 
Exercise 723
Let $F$ be a field and let $n$ be an even positive integer. Let $c \in F$ and let $A \in M_{n \times n}(F)$ be a skew-symmetric matrix having Pfaffian $d$. What is the Pfaffian of $cA$?


Exercise 724 
Letc= 5(1 + i4/3). Find the set of all real numbers a such that 


a 1 1 
1 c cleR 
1 e c 


Exercise 725 


For elements a, b, c,d of a field F, calculate the value of 


aa Fa 
a AaS 
DSa ana 
a Sa & 


Exercise 726 

Let F be a field and let n be a positive integer. Set V = Mnxn(F) and let 
A, B € V satisfy the condition that |AB| = 1. Then the function «œ : C œ> ACB 
is an endomorphism of V satisfying |C| = |æ (C)| for all C € V. Find an endo- 
morphism of V satisfying the same condition, which is not of this form. 


Exercise 727 

Let F be a field and let n be a positive integer. For A, B, C, D € Mnxn(F), show 
A B -C -D 

C D A BV 


that 


Exercise 728
Let $F$ be a field and let $A, B, C, D \in M_{n \times n}(F)$ for some positive integer $n$. If $CD = DC$ and $|D| \ne 0$, show that $\begin{vmatrix} A & B \\ C & D \end{vmatrix} = |AD - BC|$.


Exercise 729
If $F$ is a field and $A = [a_{ij}] \in M_{n \times n}(F)$ for some positive integer $n$, then we define the permanent of $A$ to be

\[
\sum_{\pi \in S_n} a_{\pi(1),1} \bullet a_{\pi(2),2} \bullet \cdots \bullet a_{\pi(n),n}.
\]

(i) Show that the permanent of $A$ is the coefficient of $X_1 \cdots X_n$ in the polynomial $\prod_{i=1}^{n} (a_{i1}X_1 + \cdots + a_{in}X_n) \in F[X_1, \ldots, X_n]$.
(ii) If $A$ is a permutation matrix, what is its permanent?
Exercise 730
Does there exist a matrix $A \in M_{5 \times 5}(\mathbb{Q})$ the permanent of which equals 120?


Exercise 731
Let $F$ be a field. For any matrix $A = [a_{ij}] \in M_{2 \times 2}(F)$, let $U(A)$ be the set of all matrices $B \in M_{2 \times 2}(F)$ satisfying $|A + B| = |A| + |B|$. Is $U(A)$ a subspace of $M_{2 \times 2}(F)$?


Exercise 732
Let $F$ be a field and let $A \in M_{2 \times 2}(F)$. Find a necessary and sufficient condition for $|I + A| = 1 + |A|$ to hold.


Exercise 733
Let $F$ be a field and let $n$ be a positive integer. If $A, B, C, D \in M_{n \times n}(F)$ with $D$ nonsingular, show that $\begin{vmatrix} A & B \\ C & D \end{vmatrix} = |AD - BD^{-1}CD|$.


Exercise 734 
Let a, b, c € Z. Find a positive integer n such that 


2c a+b+c a+b+c 
a+b+c na a+b+c 
a+b+c a+b+c 2b 


is divisible by abc. 
Exercise 735


Let $n$ be a positive integer and let $A \in M_{n \times n}(\mathbb{Q})$ be a matrix all entries of which are integers and satisfying $|A| = \pm 1$. Show that all entries of $A^{-1}$ are integers.


Exercise 736
Find the Padé approximant to $x \mapsto e^x$ of type 2/4.


Exercise 737 
Let a, b, and c be elements of a field F. Find an element x of F such that 


l a a 1 -b-c be 
1 x b*/=J]1 -c-—a ca 
lc e 1 -a—b ab 


12 Eigenvalues and Eigenvectors


One of the central problems in linear algebra is this: given a vector space $V$ finitely generated over a field $F$, and given an endomorphism $\alpha$ of $V$, is there a way to select a basis $B$ of $V$ so that the matrix $\Phi_{BB}(\alpha)$ is as nice as possible? In this chapter, we will begin by defining some basic notions which will help us address this problem.

Let $V$ be a vector space over a field $F$ and let $\alpha \in \operatorname{End}(V)$. A scalar $c \in F$ is an eigenvalue of $\alpha$ if and only if there exists a vector $v \ne 0_V$ satisfying $\alpha(v) = cv$. Such a vector is called an eigenvector¹ of $\alpha$ associated with the eigenvalue $c$. Thus we see that a nonzero vector $v \in V$ is an eigenvector of $\alpha$ if and only if the subspace $Fv$ of $V$ is invariant under $\alpha$. Every eigenvector of $\alpha$ is associated with a unique eigenvalue of $\alpha$ but any eigenvalue has, as a rule, many eigenvectors associated with it. The set of all eigenvalues of $\alpha$ is called the spectrum of $\alpha$ and is denoted by $\operatorname{spec}(\alpha)$. Thus, $c \in \operatorname{spec}(\alpha)$ if and only if the endomorphism $c\sigma_1 - \alpha$ of $V$ is not monic.


Example If $V$ is a vector space over $\mathbb{R}$ and if $\alpha \in \operatorname{End}(V)$ satisfies $\alpha^2 = -\sigma_1$, then $\operatorname{spec}(\alpha) = \emptyset$. To see this, note that if $v$ is an eigenvector corresponding to an eigenvalue $c$ then $-v = \alpha^2(v) = c^2v$ and so $(c^2 + 1)v = 0_V$, implying that $c^2 = -1$, which is impossible for a real number $c$. In particular, if $\alpha \in \operatorname{End}(\mathbb{R}^2)$ is defined by


Qa: | = E then spec(a) = Ø. 


¹The terms “eigenvalue” and “eigenvector” are due to Hilbert. Eigenvalues and eigenvectors are sometimes called characteristic values and characteristic vectors, respectively, based on terminology used by Cauchy. Sylvester coined the term “latent values” since, as he put it, such scalars are “latent in a somewhat similar sense as vapor may be said to be latent in water or smoke in a tobacco-leaf”.


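J.S. Golan, The Linear Algebra a Beginning Graduate Student Ought to Know, 255

Example Let $\alpha \in \operatorname{End}(\mathbb{R}^2)$ be defined by $\alpha : \begin{bmatrix} a \\ b \end{bmatrix} \mapsto \begin{bmatrix} b \\ a \end{bmatrix}$. Then $c \in \operatorname{spec}(\alpha)$ if and
only if there exists a nonzero vector $\begin{bmatrix} a \\ b \end{bmatrix}$ satisfying $\begin{bmatrix} b \\ a \end{bmatrix} = \begin{bmatrix} ca \\ cb \end{bmatrix}$. Therefore, we see that
$\operatorname{spec}(\alpha) = \{-1, 1\}$, where $\begin{bmatrix} a \\ -a \end{bmatrix}$ is an eigenvector of $\alpha$ associated with $-1$ and
$\begin{bmatrix} a \\ a \end{bmatrix}$ is an eigenvector of $\alpha$ associated with $1$, for any $0 \neq a \in \mathbb{R}$.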


Example Let $V$ be the vector space of all infinitely-differentiable functions from $\mathbb{R}$
to itself and let $\delta$ be the endomorphism of $V$ which assigns to each such function
its derivative. Then a function $f$, which is not the 0-function, is an eigenvector
of $\delta$ if and only if there exists a scalar $c \in \mathbb{R}$ such that $\delta(f) = cf$. For
any real number $c$, there is indeed such a function in $V$, namely the function
$x \mapsto e^{cx}$. Thus $\operatorname{spec}(\delta) = \mathbb{R}$. The set of all eigenvectors of $\delta$ associated with $c$ is
$\{ae^{cx} \mid a \neq 0\}$. This fact has important applications in the theory of differential
equations.


The first use of eigenvalues to study differential equations is due to the French
mathematician Jean d’Alembert, one of the foremost researchers of the eighteenth century.
Important solutions of eigenvalue problems for second-order differential equations were
obtained in the nineteenth century by Swiss mathematician Charles-François Sturm and
French mathematician Joseph Liouville.


Let $\alpha$ be an endomorphism of a vector space $V$ over a field $F$ having an eigenvalue
$c$. If $\beta \in \operatorname{Aut}(V)$ then $c$ is also an eigenvalue of $\beta\alpha\beta^{-1}$. Indeed, if $v$ is an
eigenvector of $\alpha$ associated with $c$ then $\beta\alpha\beta^{-1}(\beta(v)) = \beta\alpha(v) = \beta(cv) = c\beta(v)$
and $\beta(v) \neq 0_V$ since $\beta$ is an automorphism. Therefore, $\beta(v)$ is an eigenvector of
$\beta\alpha\beta^{-1}$ associated with $c$.

Similarly, let $p(X) = \sum_{i=0}^{k} b_i X^i \in F[X]$. If $v \in V$ is an eigenvector of $\alpha$ associated
with an eigenvalue $c$, then $v$ is also an eigenvector of $p(\alpha) \in \operatorname{End}(V)$ associated
with the eigenvalue $p(c)$, since $p(\alpha)v = \sum_{i=0}^{k} b_i \alpha^i(v) = \sum_{i=0}^{k} b_i c^i v = p(c)v$. In
particular, we see that, for any positive integer $n$, the vector $v$ is an eigenvector of
$\alpha^n$ associated with the eigenvalue $c^n$.
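
The identity $p(\alpha)v = p(c)v$ can be checked numerically. The following is only a minimal sketch: the matrix (the coordinate swap on $\mathbb{R}^2$), the eigenvector, and the polynomial are assumptions chosen for illustration, and the helper names `mat_vec` and `poly_apply` are hypothetical.

```python
# Sketch: an eigenvector of A for c is also an eigenvector of p(A) for p(c).
# The matrix, vector and polynomial below are illustrative assumptions.

def mat_vec(m, v):
    """Apply a square matrix (given as a list of rows) to a vector."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


def poly_apply(coeffs, m, v):
    """Return p(m)v, where p(X) is the sum of coeffs[i] * X**i."""
    result = [0] * len(v)
    power = list(v)                 # m^0 applied to v
    for b in coeffs:
        result = [r + b * x for r, x in zip(result, power)]
        power = mat_vec(m, power)   # advance to the next power of m
    return result


A = [[0, 1], [1, 0]]                # swaps the two coordinates
v = [1, -1]                         # A v = -v, so c = -1
c = -1
coeffs = [5, 0, 2, 1]               # p(X) = 5 + 2X^2 + X^3

p_of_c = sum(b * c**i for i, b in enumerate(coeffs))   # p(c)
lhs = poly_apply(coeffs, A, v)                          # p(A)v
rhs = [p_of_c * x for x in v]                           # p(c)v
print(lhs == rhs, p_of_c)
```

Integer arithmetic keeps the comparison exact, so no tolerance is needed.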

Let $V$ be a vector space over a field $F$ and let $\alpha$ be an endomorphism of $V$.
A vector $v \in V$ is a fixed point of $\alpha$ if and only if $\alpha(v) = v$. It is clear that $0_V$ is a
fixed point of every endomorphism of $V$ and a nonzero vector $v$ is a fixed point of
$\alpha$ if and only if $1 \in \operatorname{spec}(\alpha)$ and $v$ is an eigenvector of $\alpha$ associated with 1.

Proposition 12.1 Let $V$ be a vector space over a field $F$ and let $\alpha$ be an
endomorphism of $V$ having an eigenvalue $c$. The subset $W$ composed of $0_V$
and all eigenvectors of $\alpha$ associated with $c$ is a subspace of $V$.


Proof If $w, w' \in W$ and $a \in F$ then $\alpha(w + w') = \alpha(w) + \alpha(w') = cw + cw' =
c(w + w')$ and $\alpha(aw) = a\alpha(w) = a(cw) = c(aw)$ and so $w + w', aw \in W$, proving
that $W$ is a subspace of $V$.


Let $V$ be a vector space over a field $F$ and let $\alpha$ be an endomorphism of $V$
having an eigenvalue $c$. The subset $W$ composed of $0_V$ and all eigenvectors of $\alpha$
associated with $c$, which we know by Proposition 12.1 is a subspace of $V$, is called
the eigenspace of $\alpha$ associated with $c$. In particular, if 1 is an eigenvalue of $\alpha$ then
the fixed space of $\alpha$ is the eigenspace associated with 1. If $1 \notin \operatorname{spec}(\alpha)$ then the fixed
space of $\alpha$ is taken to be $\{0_V\}$.


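Example Define $\alpha \in \operatorname{End}(\mathbb{R}^3)$ by $\alpha : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} a \\ 0 \\ c \end{bmatrix}$. Then $1 \in \operatorname{spec}(\alpha)$ and the
eigenspace of $\alpha$ associated with 1 (namely the fixed space of $\alpha$) is
$\mathbb{R}\left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right\}$.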


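Example Small errors in recording data may lead to considerable errors in the calculation
of eigenspaces, even if the eigenvalues are calculated correctly. For example,
let $a, b, c, e \in \mathbb{R}$ and let $\alpha$ and $\beta$ be the endomorphisms of $\mathbb{R}^3$ represented with
respect to some fixed basis by the matrices
\[
\begin{bmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{bmatrix}
\quad\text{and}\quad
\begin{bmatrix} a & e & e \\ 0 & b & e \\ 0 & 0 & c \end{bmatrix},
\]
respectively. Then $\operatorname{spec}(\alpha) = \operatorname{spec}(\beta) = \{a, b, c\}$. The eigenspaces of $\alpha$ associated with
$a$, $b$, $c$ are
$\mathbb{R}\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$, $\mathbb{R}\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$, and $\mathbb{R}\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$.
The eigenspaces of $\beta$ associated with $a$, $b$, $c$ are
$\mathbb{R}\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$, $\mathbb{R}\begin{bmatrix} e \\ b-a \\ 0 \end{bmatrix}$, and $\mathbb{R}\begin{bmatrix} e(e+c-b) \\ e(c-a) \\ (c-a)(c-b) \end{bmatrix}$.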
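
To see how sensitive the eigenspaces are, one can plug concrete numbers into the formulas above. The sketch below is only an illustration under assumed values ($a = 1$, $b = 1 + e$, $c = 3$, $e = 1/1000$); the function names are hypothetical. It uses exact rational arithmetic so that the eigenvector checks are not blurred by rounding.

```python
from fractions import Fraction

# Sketch: verify the eigenvectors of the perturbed matrix for assumed sample values.

def mat_vec(m, v):
    """Apply a square matrix (list of rows) to a vector."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


def perturbed_eigenvectors(a, b, c, e):
    """Eigenvectors of [[a, e, e], [0, b, e], [0, 0, c]] for a, b, c, in that order."""
    return [
        [1, 0, 0],
        [e, b - a, 0],
        [e * (e + c - b), e * (c - a), (c - a) * (c - b)],
    ]


def is_eigenpair(m, lam, v):
    """True if v is nonzero and m v = lam v."""
    return any(v) and mat_vec(m, v) == [lam * x for x in v]


e = Fraction(1, 1000)                       # small recording error (assumed value)
a, b, c = Fraction(1), Fraction(1) + e, Fraction(3)
beta = [[a, e, e], [0, b, e], [0, 0, c]]
vectors = perturbed_eigenvectors(a, b, c, e)
checks = [is_eigenpair(beta, lam, v) for lam, v in zip((a, b, c), vectors)]
print(checks)

# Rescale the eigenvector for b so that its second entry is 1,
# for comparison with the unperturbed eigenvector [0, 1, 0].
v_b = [x / vectors[1][1] for x in vectors[1]]
print(v_b)
```

With these values the eigenvalue $b$ is unchanged, yet the rescaled eigenvector has first entry 1 rather than 0, even though the perturbation $e$ is only $1/1000$.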


Example Let $V = C(0, 1)$ and let $\alpha$ be the endomorphism of $V$ defined by
\[
\alpha(f) : x \mapsto \int_0^1 \cos(\pi[x - t]) f(t)\,dt
\]
for all $f \in V$. To find the eigenvalues of $\alpha$, recall the trigonometric identity
\[
\cos(\pi[x - t]) = \cos(\pi x)\cos(\pi t) + \sin(\pi x)\sin(\pi t).
\]
Using this identity, we see that if $f \in V$ then
\[
\alpha(f) : x \mapsto \left[\int_0^1 \cos(\pi t) f(t)\,dt\right]\cos(\pi x) + \left[\int_0^1 \sin(\pi t) f(t)\,dt\right]\sin(\pi x)
\]
and so the image of $\alpha$ is contained in the subspace $W = \mathbb{R}\{g_1, g_2\}$ of $V$, where
$g_1 : x \mapsto \cos(\pi x)$ and $g_2 : x \mapsto \sin(\pi x)$. It is easy to see that $\alpha(g_1) = \frac{1}{2} g_1$ and
$\alpha(g_2) = \frac{1}{2} g_2$, so both of these functions are eigenvectors of $\alpha$ associated with
the eigenvalue $\frac{1}{2}$. Moreover, $\{g_1, g_2\}$ is linearly independent. Since every eigenvector
associated with a nonzero eigenvalue lies in the image of $\alpha$, we see that $\frac{1}{2}$ is the only
nonzero eigenvalue of $\alpha$ and the eigenspace associated with it is $W$.


Proposition 12.2 Let $V$ be a vector space finitely generated over a field $F$
and let $\alpha$ be an endomorphism of $V$. Then the following conditions on a scalar
$c$ are equivalent:

(1) $c$ is an eigenvalue of $\alpha$;

(2) $c\sigma_1 - \alpha \notin \operatorname{Aut}(V)$;

(3) If $A = \Phi_{BB}(\alpha)$ for some basis $B$ of $V$, then $|cI - A| = 0$.


Proof (1) $\Leftrightarrow$ (2): Condition (1) is satisfied if and only if there exists a nonzero vector
$v \in V$ satisfying $\alpha(v) = cv$, i.e., if and only if $(c\sigma_1 - \alpha)(v) = 0_V$. This is true if and
only if $\ker(c\sigma_1 - \alpha) \neq \{0_V\}$. Since $V$ is finitely generated, by Proposition 7.3, we
know that this is true if and only if condition (2) holds.

(2) $\Leftrightarrow$ (3): This is a direct consequence of the fact that a matrix is nonsingular if
and only if its determinant is nonzero.


From Proposition 12.2, we see how to define eigenvalues of square matrices over
a field: if $F$ is a field and $n$ is a positive integer, then $c \in F$ is an eigenvalue of a
matrix $A \in \mathcal{M}_{n \times n}(F)$ if and only if $|cI - A| = 0$, namely if and only if the matrix
$cI - A$ is singular. The set of all eigenvalues of $A$ will be denoted by $\operatorname{spec}(A)$. In
particular, we observe that a matrix $A$ is nonsingular if and only if $0 \notin \operatorname{spec}(A)$.


A vector $v \in F^n$ other than $\begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$ is an eigenvector of $A$ associated with the eigenvalue $c$ if
and only if $Av = cv$. The subset of $F^n$ consisting of $\begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$ and all eigenvectors of
$A$ associated with $c$ is a subspace of $F^n$ called the eigenspace associated with $c$. In
the case that $F$ equals $\mathbb{R}$ or $\mathbb{C}$, the number $\rho(A) = \max\{|c| \mid c \in \operatorname{spec}(A)\}$ is called
the spectral radius of the matrix $A$, and plays a very important part in the numerical
analysis of matrices. Note that if $F = \mathbb{C}$, then $\rho(A)$ is just the radius of the smallest
circle in the complex plane, centered at the origin, containing $\operatorname{spec}(A)$. Moreover,
since $\operatorname{spec}(A)$ consists precisely of the poles of the function $z \mapsto |zI - A|^{-1}$, this
observation allows the use of powerful techniques of complex analysis in the study
of the spectra of complex matrices.

Calculating the spectra of matrices is a critical tool in many applications of math- 
ematics. Thus, for example, in statistics one learns that finding the spectrum of co- 
variance matrices is an integral part of several data analysis techniques. 


Example It is not necessarily true that 0(AB) = p(A)p(B) for square matrices A 
and B. For example, if A = f l and B = i in M2x2(R), then p(A) = 
0 = p(B), whereas p(AB) = 4. 
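
The failure of multiplicativity of the spectral radius is easy to reproduce for $2 \times 2$ matrices, whose eigenvalues are the roots of $X^2 - tX + d$ with $t$ the trace and $d$ the determinant. The sketch below uses a hypothetical pair of nilpotent matrices chosen for illustration; it is not necessarily the pair printed in the example above.

```python
import cmath

# Sketch: spectral radius of a 2x2 matrix from its trace and determinant.

def spectral_radius_2x2(m):
    """Largest absolute value of a root of X^2 - trace*X + det."""
    t = m[0][0] + m[1][1]
    d = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(t * t - 4 * d)
    return max(abs((t + root) / 2), abs((t - root) / 2))


def mat_mul(x, y):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


A = [[1, 1], [-1, -1]]      # assumed example: A squared is the zero matrix
B = [[1, -1], [1, -1]]      # assumed example: B squared is the zero matrix
AB = mat_mul(A, B)

print(spectral_radius_2x2(A), spectral_radius_2x2(B), spectral_radius_2x2(AB))
```

Both factors have spectral radius 0 because they are nilpotent, while their product has the nonzero eigenvalue 4.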

Given a matrix $A \in \mathcal{M}_{n \times n}(F)$, we note that $|cI - A| = |(cI - A)^T| = |cI - A^T|$
and so $\operatorname{spec}(A) = \operatorname{spec}(A^T)$. However, for each such common eigenvalue, the associated
eigenvectors may be different.


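Example Let $A = \begin{bmatrix} 1 & 1 & -2 \\ -1 & 2 & 1 \\ 0 & 1 & -1 \end{bmatrix} \in \mathcal{M}_{3 \times 3}(\mathbb{R})$. Then $\operatorname{spec}(A) = \{-1, 1, 2\}$ and so
this is also $\operatorname{spec}(A^T)$.
(1) The eigenspace of $A$ associated with $-1$ is $\mathbb{R}\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$ and the eigenspace of $A^T$
associated with $-1$ is $\mathbb{R}\begin{bmatrix} 1 \\ 2 \\ -7 \end{bmatrix}$;
(2) The eigenspace of $A$ associated with 1 is $\mathbb{R}\begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix}$ and the eigenspace of $A^T$
associated with 1 is $\mathbb{R}\begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}$;
(3) The eigenspace of $A$ associated with 2 is $\mathbb{R}\begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix}$ and the eigenspace of $A^T$
associated with 2 is $\mathbb{R}\begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix}$.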

It is interesting to note the following. Let $F$ be a field and let $n$ be a positive
integer. If $v, w \in F^n$, then $v \wedge w = vw^T \in \mathcal{M}_{n \times n}(F)$ and $v \odot w = v^T w \in F$. Direct
calculation then yields $(v \wedge w)v = (v \odot w)v$, showing that $v$ is an eigenvector of
$v \wedge w$ associated with the eigenvalue $v \odot w$.
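
The "direct calculation" is the observation that $(vw^T)v = v(w^T v)$. A minimal numeric sketch follows, with vectors chosen arbitrarily for illustration and hypothetical helper names:

```python
# Sketch: v is an eigenvector of the matrix v w^T with eigenvalue v^T w.

def outer(v, w):
    """Return the matrix v w^T as a list of rows."""
    return [[vi * wj for wj in w] for vi in v]


def dot(v, w):
    """Return the scalar v^T w."""
    return sum(x * y for x, y in zip(v, w))


def mat_vec(m, v):
    """Apply a matrix (list of rows) to a vector."""
    return [dot(row, v) for row in m]


v = [1, 2, 3]                       # assumed sample vectors
w = [2, 0, 1]
eigenvalue = dot(v, w)
image = mat_vec(outer(v, w), v)     # (v w^T) v
print(image, [eigenvalue * x for x in v])
```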


Example Let $n$ be a positive integer and let $A = [a_{ij}]$ be an $n \times n$ Markov matrix,
which we will consider as an element of $\mathcal{M}_{n \times n}(\mathbb{C})$. We claim that $\rho(A) \leq 1$. Indeed,
let $c \in \operatorname{spec}(A)$ and let $v = \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix} \in \mathbb{C}^n$ be an eigenvector associated with $c$. Let
$1 \leq h \leq n$ satisfy the condition that $|b_i| \leq |b_h|$ for all $1 \leq i \leq n$. Then $Av = cv$
implies, in particular, that $\sum_{j=1}^{n} a_{hj} b_j = c b_h$ and so
\[
|c| \cdot |b_h| = |c b_h| = \left| \sum_{j=1}^{n} a_{hj} b_j \right| \leq \sum_{j=1}^{n} a_{hj} |b_j| \leq \left( \sum_{j=1}^{n} a_{hj} \right) |b_h| = |b_h|.
\]
Since $b_h \neq 0$, we have $|c| \leq 1$, as claimed.


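Example Let $n$ be a positive integer and let $A \in \mathcal{M}_{n \times n}(\mathbb{R})$ be a skew-symmetric
matrix. We claim that $\operatorname{spec}(A) \subseteq \{0\}$, with equality when $n$ is odd. Indeed, let
$c \in \operatorname{spec}(A)$ and let $v \in \mathbb{R}^n$ be an eigenvector of $A$ associated with $c$. Then
$-A^T v = Av = cv$ and so $-A^T(Av) = -A^T(cv) = c(-A^T v) = c^2 v$. Therefore,
$-(Av \odot Av) = -v^T A^T A v = v^T(c^2 v) = c^2 (v \odot v)$. But if $y = \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix}$ is any vector
in $\mathbb{R}^n$, then $y \odot y = \sum_{i=1}^{n} b_i^2 \geq 0$, with equality if and only if $y = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$. Thus the
left-hand side is not positive while $v \odot v > 0$, since $v$ is nonzero. We conclude that we
must have $c^2 = 0$ and so $c = 0$. Therefore, $\operatorname{spec}(A) \subseteq \{0\}$. If $n$ is odd then, by the remark
after Proposition 11.7, we know that $A$ is singular and so $0 \in \operatorname{spec}(A)$, establishing equality.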


Example Let $n$ be a positive integer and let $A \in \mathcal{M}_{n \times n}(\mathbb{C})$. If $c$ is a nonzero eigenvalue
of $A$ and if $v \in \mathbb{C}^n$ is an eigenvector associated with $c$ then, by Proposition 11.13,
we know that $|A|v = \operatorname{adj}(A)Av = c[\operatorname{adj}(A)]v$ and so $[\operatorname{adj}(A)]v =
c^{-1}|A|v$. Thus $v$ is also an eigenvector of $\operatorname{adj}(A)$ associated with the eigenvalue
$c^{-1}|A|$.


If $F$ is a field, if $n$ is a positive integer, and if $A \in \mathcal{M}_{n \times n}(F)$ is a matrix
having eigenvalue $c$, then $|cI - A| = 0$ and so, by Proposition 11.13, we have
$(cI - A)\operatorname{adj}(cI - A) = O$, whence $A[\operatorname{adj}(cI - A)] = c[\operatorname{adj}(cI - A)]$. From this
we conclude that each of the columns of $\operatorname{adj}(cI - A)$ must belong to the eigenspace
of $A$ associated with $c$.


Example Let $A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 4 & -17 & 8 \end{bmatrix} \in \mathcal{M}_{3 \times 3}(\mathbb{R})$. Then one can calculate that
$\operatorname{spec}(A) = \{2 - \sqrt{3}, 2 + \sqrt{3}, 4\}$. Moreover, $\operatorname{adj}(4I - A) = \begin{bmatrix} 1 & -4 & 1 \\ 4 & -16 & 4 \\ 16 & -64 & 16 \end{bmatrix}$ and
it is easy to check that the columns of this matrix are indeed eigenvectors of $A$
associated with 4.
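
The check can also be carried out mechanically. The sketch below computes the adjoint by cofactors (a straightforward method, practical only for small matrices) and confirms that every column of $\operatorname{adj}(4I - A)$ satisfies $Av = 4v$; the helper names are hypothetical.

```python
# Sketch: columns of adj(cI - A) lie in the eigenspace of A associated with c.

def minor(m, i, j):
    """Matrix m with row i and column j deleted."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))


def adjugate(m):
    """Classical adjoint: entry (i, j) is the (j, i) cofactor of m."""
    n = len(m)
    return [[(-1) ** (i + j) * det(minor(m, j, i)) for j in range(n)]
            for i in range(n)]


A = [[0, 1, 0], [0, 0, 1], [4, -17, 8]]
c = 4
shifted = [[(c if i == j else 0) - A[i][j] for j in range(3)] for i in range(3)]
adj = adjugate(shifted)
print(adj)

# Each column v of adj should satisfy A v = c v.
columns = [[adj[i][j] for i in range(3)] for j in range(3)]
ok = all(
    [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)] == [c * x for x in v]
    for v in columns
)
print(ok)
```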


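Proposition 12.3 If $V$ is a vector space finitely generated over a field $F$ and
if $\alpha, \beta \in \operatorname{End}(V)$ then $\operatorname{spec}(\alpha\beta) = \operatorname{spec}(\beta\alpha)$.


Proof Let $c \in \operatorname{spec}(\alpha\beta)$. If $c = 0$, this means that $\alpha\beta \notin \operatorname{Aut}(V)$. Therefore, either $\alpha$
or $\beta$ is not an automorphism of $V$, and so $\beta\alpha \notin \operatorname{Aut}(V)$ as well. Therefore, we can
assume $c \neq 0$. Let $v$ be an eigenvector of $\alpha\beta$ associated with $c$ and let $w = \beta(v)$.
Then $\alpha(w) = \alpha\beta(v) = cv \neq 0_V$ and so $w \neq 0_V$. Moreover, $\beta\alpha(w) = \beta\alpha\beta(v) =
\beta(cv) = c\beta(v) = cw$ and so $w$ is an eigenvector of $\beta\alpha$ associated with $c$. Thus
$\operatorname{spec}(\alpha\beta) \subseteq \operatorname{spec}(\beta\alpha)$. A similar argument shows the reverse inclusion, and so we
have equality.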


In particular, as a consequence of Proposition 12.3, we see that if $F$ is a field, if
$n$ is a positive integer, and if $A, B \in \mathcal{M}_{n \times n}(F)$ then $\operatorname{spec}(AB) = \operatorname{spec}(BA)$.

As we noted at the beginning of the chapter, if we are given a vector space $V$
finitely generated over a field $F$ and an endomorphism $\alpha$ of $V$, we would like to
find, to the extent possible, a basis $B$ of $V$ such that the matrix $\Phi_{BB}(\alpha)$ is nice, in
the sense that it is amenable to quick and accurate calculations. Let $V$ be a vector
space over a field $F$ (not necessarily finitely generated) and let $\alpha \in \operatorname{End}(V)$. Then $\alpha$
is diagonalizable if and only if there exists a basis $B$ of $V$ composed of eigenvectors
of $\alpha$.


Example We have already seen that the set $B$ of all functions in $\mathbb{R}^{\mathbb{R}}$ of the form
$x \mapsto e^{ax}$, for some $a \in \mathbb{R}$, is linearly independent. Therefore, $W = \mathbb{R}B$ is a subspace
of $\mathbb{R}^{\mathbb{R}}$ which is not finitely generated, and $B$ is a basis for $W$. Let $\alpha$ be the
endomorphism of $W$ which assigns to each $f \in W$ its derivative. Since each element
of $B$ is an eigenvector of $\alpha$, we see that $\alpha$ is diagonalizable.


The following result characterizes the diagonalizable endomorphisms of finitely- 
generated vector spaces. 


Proposition 12.4 Let $V$ be a vector space finitely generated over a field
$F$ and let $\alpha \in \operatorname{End}(V)$. Then the following conditions on a basis $B =
\{v_1, \ldots, v_n\}$ are equivalent:

(1) $v_i$ is an eigenvector of $\alpha$ for each $1 \leq i \leq n$;

(2) $\Phi_{BB}(\alpha)$ is a diagonal matrix.


Proof (1) $\Rightarrow$ (2): By (1), we know that for each $1 \leq i \leq n$ there exists a scalar $c_i$
satisfying $\alpha(v_i) = c_i v_i$ and so, by definition, $\Phi_{BB}(\alpha)$ is the diagonal matrix $[a_{ij}]$
given by
\[
a_{ij} = \begin{cases} c_i & \text{if } i = j, \\ 0 & \text{otherwise.} \end{cases}
\]

(2) $\Rightarrow$ (1): If $\Phi_{BB}(\alpha) = [a_{ij}]$ is a diagonal matrix then for each $1 \leq i \leq n$ we
have $\alpha(v_i) = a_{ii} v_i$ and so $v_i$ is an eigenvector of $\alpha$ for each $1 \leq i \leq n$.


Let $V$ be a vector space over a field $F$ and let $\alpha \in \operatorname{End}(V)$. If $B$ is a basis of
$V$ made up of eigenvectors of $\alpha$ then, as we have seen above, the elements of $B$
are also eigenvectors of $p(\alpha)$ for any polynomial $p(X) \in F[X]$. We need not stick
to polynomials: suppose that each $v \in B$ is an eigenvector of $\alpha$ associated with an
eigenvalue $c_v$ of $\alpha$. Given any function whatsoever $f : \operatorname{spec}(\alpha) \to F$, we can define
the endomorphism $f(\alpha)$ of $V$ by setting $f(\alpha) : \sum_{v \in B} a_v v \mapsto \sum_{v \in B} a_v f(c_v) v$
and the elements of $B$ are also eigenvectors of $f(\alpha)$. We note that if $f$ and $g$ are
functions from $\operatorname{spec}(\alpha)$ to $F$ then $f(\alpha)g(\alpha) = g(\alpha)f(\alpha)$.

Now assume that $V$ is finitely generated over $F$ and that $B = \{v_1, \ldots, v_n\}$ is a
basis of $V$ made up of eigenvectors of $\alpha \in \operatorname{End}(V)$. For each $1 \leq i \leq n$, let $c_i$ be
the eigenvalue of $\alpha$ associated with $v_i$. We have already seen that for each such
$i$ there exists a polynomial $p_i(X)$, namely the Lagrange interpolation polynomial,
satisfying the condition that
\[
p_i(c_j) = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{otherwise.} \end{cases}
\]
Thus, given a function $f : \operatorname{spec}(\alpha) \to F$, the polynomial $p(X) = \sum_{i=1}^{n} f(c_i) p_i(X)$
satisfies $p(c_i) = f(c_i)$ for all $1 \leq i \leq n$, and so $p(\alpha) = f(\alpha)$. Thus, for finitely-generated
vector spaces, the above generalization does not in fact contribute anything
new; it is important, however, in the case of vector spaces which are not finitely
generated.

We now show that the size of the spectrum of an endomorphism of a finitely-generated
vector space is limited.


Proposition 12.5 Let $V$ be a vector space over a field $F$ and let $\alpha \in \operatorname{End}(V)$.
If $c_1, \ldots, c_k$ are distinct eigenvalues of $\alpha$ and if $v_i$ is an eigenvector of $\alpha$
associated with $c_i$ for each $1 \leq i \leq k$, then the set $\{v_1, \ldots, v_k\}$ is linearly
independent.


Proof Assume that the set $\{v_1, \ldots, v_k\}$ is linearly dependent. Since $v_1 \neq 0_V$, we
know that the set $\{v_1\}$ is linearly independent. Thus there exists an integer $1 \leq t < k$
such that the set $\{v_1, \ldots, v_t\}$ is linearly independent but $\{v_1, \ldots, v_{t+1}\}$ is linearly
dependent. In other words, there exist scalars $a_1, \ldots, a_{t+1}$, not all of which are equal
to 0, such that $\sum_{i=1}^{t+1} a_i v_i = 0_V$ and so $0_V = c_{t+1}\left(\sum_{i=1}^{t+1} a_i v_i\right) = \sum_{i=1}^{t+1} a_i c_{t+1} v_i$.
On the other hand, $0_V = \alpha\left(\sum_{i=1}^{t+1} a_i v_i\right) = \sum_{i=1}^{t+1} a_i \alpha(v_i) = \sum_{i=1}^{t+1} a_i c_i v_i$. Therefore,
$0_V = \sum_{i=1}^{t+1} a_i c_i v_i - \sum_{i=1}^{t+1} a_i c_{t+1} v_i = \sum_{i=1}^{t} a_i (c_i - c_{t+1}) v_i$. But the set $\{v_1, \ldots, v_t\}$
is linearly independent and so $a_i (c_i - c_{t+1}) = 0$ for all $1 \leq i \leq t$. Since, by assumption,
$c_i \neq c_{t+1}$ for all $1 \leq i \leq t$, we have $a_i = 0$ for all $1 \leq i \leq t$ and hence $a_{t+1} = 0$
as well, which is a contradiction. Thus $\{v_1, \ldots, v_k\}$ must be linearly independent.


Thus we see that if $F$ is a field and if $A \in \mathcal{M}_{n \times n}(F)$, then $\operatorname{spec}(A)$ can have at
most $n$ elements. In particular, if $F$ has more than $n$ elements, then there exists an
element $c \in F \smallsetminus \operatorname{spec}(A)$, and so $(cI - A)v \neq \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$ for all $v \neq \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$. This implies
that $cI - A$ is nonsingular.
From Proposition 12.5, we see that if $\alpha$ is an endomorphism of a vector space
$V$ over a field $F$ having distinct eigenvalues $c_1, \ldots, c_k$, and if $W_i$ is the eigenspace
associated with $c_i$ for all $1 \leq i \leq k$, then the collection $\{W_1, \ldots, W_k\}$ of subspaces
of $V$ is independent. Moreover, if $V$ is finitely generated over $F$ then the number of
elements in $\operatorname{spec}(\alpha)$ is no greater than $\dim(V)$.


Proposition 12.6 Let V be a vector space of finite dimension n over a field F. 
Then any endomorphism a of V having n distinct eigenvalues is diagonaliz- 
able. 


Proof This is a direct consequence of Proposition 12.4 and Proposition 12.5. 


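Example Let $\alpha \in \operatorname{End}(\mathbb{R}^2)$ be defined by $\alpha : \begin{bmatrix} a \\ b \end{bmatrix} \mapsto \begin{bmatrix} 3a - b \\ -a + 3b \end{bmatrix}$. Then $\alpha\left(\begin{bmatrix} 1 \\ 1 \end{bmatrix}\right) =
\begin{bmatrix} 2 \\ 2 \end{bmatrix}$ and so $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ is an eigenvector of $\alpha$ associated with the eigenvalue 2. Also,
$\alpha\left(\begin{bmatrix} 1 \\ -1 \end{bmatrix}\right) = \begin{bmatrix} 4 \\ -4 \end{bmatrix}$ and so $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$ is an eigenvector of $\alpha$ associated with the eigenvalue
4. Thus $B = \left\{ \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ -1 \end{bmatrix} \right\}$ is a basis for $\mathbb{R}^2$ and $\Phi_{BB}(\alpha) = \begin{bmatrix} 2 & 0 \\ 0 & 4 \end{bmatrix}$.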


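Example Let $\alpha \in \operatorname{End}(\mathbb{R}^2)$ be defined by $\alpha : \begin{bmatrix} a \\ b \end{bmatrix} \mapsto \begin{bmatrix} a + b \\ b \end{bmatrix}$. If $\begin{bmatrix} a \\ b \end{bmatrix} \neq \begin{bmatrix} 0 \\ 0 \end{bmatrix}$ and
$\alpha\left(\begin{bmatrix} a \\ b \end{bmatrix}\right) = c\begin{bmatrix} a \\ b \end{bmatrix}$ then $cb = b$ and $a + b = ca$, and this can happen only when
$b = 0$ and $c = 1$. Thus $\operatorname{spec}(\alpha) = \{1\}$ and the eigenspace associated with this sole
eigenvalue is $\mathbb{R}\begin{bmatrix} 1 \\ 0 \end{bmatrix}$. Since this is not all of $\mathbb{R}^2$, we know that there is no basis of
$\mathbb{R}^2$ made up of eigenvectors of $\alpha$, and hence $\alpha$ is not diagonalizable.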


Note that the converse of Proposition 12.6 is false, as we easily see by taking
$\alpha = \sigma_1$.

From the above, we know that if $A \in \mathcal{M}_{n \times n}(\mathbb{R})$ then the matrix has at most $n$
distinct eigenvalues. However, it may have many fewer than that. If we assume that
the entries of this matrix were chosen independently and randomly from a standard
normal distribution, how many distinct eigenvalues should we expect? American
mathematicians Alan Edelman, Eric Kostlan, and Michael Shub have shown that if
$E_n$ denotes the mathematical expectancy for the number of eigenvalues of such a
matrix in $\mathbb{R}$, then $\lim_{n \to \infty} \frac{E_n}{\sqrt{n}} = \sqrt{\frac{2}{\pi}}$. The situation over the complex numbers
is quite different. Given a matrix $A \in \mathcal{M}_{n \times n}(\mathbb{C})$ one can, with probability 1, pick
a matrix $B \in \mathcal{M}_{n \times n}(\mathbb{C})$ as near to $A$ as we wish, which has $n$ distinct eigenvalues
in $\mathbb{C}$.

If $F$ is a field, if $n$ is a positive integer, and if $A \in \mathcal{M}_{n \times n}(F)$, then we can
consider the matrix of polynomials $XI - A \in \mathcal{M}_{n \times n}(F[X])$. The determinant of
this matrix, $|XI - A|$, is a polynomial in $F[X]$ called the characteristic polynomial
of $A$. Note that this polynomial is always monic and of degree $n$.


Example The characteristic polynomial of $\begin{bmatrix} 1 & -1 & 0 \\ 2 & 1 & 5 \\ 4 & 2 & 1 \end{bmatrix} \in \mathcal{M}_{3 \times 3}(\mathbb{R})$ is
$X^3 - 3X^2 - 5X + 27$.
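
Characteristic polynomials of small matrices can be computed without expanding $|XI - A|$ symbolically. The sketch below uses the Faddeev-LeVerrier recurrence, a technique brought in here for illustration and not taken from the text; it divides by $1, \ldots, n$, so it is assumed that the field has characteristic 0 (rationals are used). The function names are hypothetical.

```python
from fractions import Fraction

# Sketch: coefficients of |XI - A| via the Faddeev-LeVerrier recurrence
# (an assumed technique; valid over fields of characteristic 0).

def mat_mul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly(a):
    """Return [c_0, ..., c_n] with |XI - a| = c_0 + c_1 X + ... + c_n X^n."""
    n = len(a)
    a = [[Fraction(x) for x in row] for row in a]
    coeffs = [Fraction(0)] * n + [Fraction(1)]      # c_n = 1: the polynomial is monic
    m = [[Fraction(0)] * n for _ in range(n)]       # M_0 = O
    for k in range(1, n + 1):
        am = mat_mul(a, m)
        # M_k = a M_{k-1} + c_{n-k+1} I
        m = [[am[i][j] + (coeffs[n - k + 1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        product = mat_mul(a, m)
        trace = sum(product[i][i] for i in range(n))
        coeffs[n - k] = -trace / k                  # c_{n-k} = -tr(a M_k) / k
    return coeffs


A = [[1, -1, 0], [2, 1, 5], [4, 2, 1]]
coefficients = char_poly(A)
print([int(c) for c in coefficients])   # lowest degree first
```

For the matrix of the example this reproduces the coefficients 27, -5, -3, 1 of $X^3 - 3X^2 - 5X + 27$.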
1 2 1 2 
SON : 0 1 2 3 : 
Example The characteristic polynomial of A = 3 | € Ma,4(R) is 
1 1 2 0 


$X^4 - 3X^3 - 11X^2 - 25X - 15$. If we sketch the graph of the polynomial function
$t \mapsto t^4 - 3t^3 - 11t^2 - 25t - 15$, we see that it has real roots in the neighborhoods
of $-0.8$ and $5.8$. (More precisely, they are approximately equal to $-0.8062070604$
and $5.7448832706$.) These are the only real eigenvalues of the matrix $A$.


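Example Let $F = GF(3)$. The characteristic polynomial of $A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 \\ 1 & 1 & 1 & 0 \end{bmatrix} \in
\mathcal{M}_{4 \times 4}(F)$ equals $X^4 + X^3 + 1 = (X + 2)(X^3 + 2X^2 + 2X + 2)$ and so $A$ has only
one eigenvalue, namely 1.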


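Example The characteristic polynomial of $A = \begin{bmatrix} 5 & 4 & 2 \\ 4 & 5 & 2 \\ 2 & 2 & 2 \end{bmatrix}$ in $\mathcal{M}_{3 \times 3}(\mathbb{Q})$ is
$(X - 10)(X - 1)^2$ and so $\operatorname{spec}(A) = \{1, 10\}$. The eigenspace of $A$ associated with 10
is $\mathbb{Q}\begin{bmatrix} 2 \\ 2 \\ 1 \end{bmatrix}$, while the eigenspace of $A$ associated with 1 is $\mathbb{Q}\left\{ \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ 2 \end{bmatrix} \right\}$.


Example Let $F = GF(2)$ and let $A = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} \in \mathcal{M}_{2 \times 2}(F)$. The characteristic
polynomial of $A$ is $p(X) = X^2 + X + 1$ and, since $p(0) = p(1) = 1$ we see that
$\operatorname{spec}(A) = \emptyset$. In fact, it is possible to show that for every prime integer $p$ there is a
symmetric $2 \times 2$ matrix $A$ over $GF(p)$ satisfying $\operatorname{spec}(A) = \emptyset$. Later, we will show
that any symmetric matrix over $\mathbb{R}$ must have an eigenvalue.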


Example Let a be the endomorphism of C? represented with respect to the canon- 

: R : 1 : | € M2xx(C). The characteristic poly- 

nomial of A is (X — 1)? and so spec(A) = 1 The eigenspace associated with it is 
i 


C | 1 i which has dimension 1. Therefore, œ is not diagonalizable. 


ical basis by the matrix A = 


Proposition 12.7 Let $F$ be a field and let $n$ be a positive integer. If
$A \in \mathcal{M}_{n \times n}(F)$ has characteristic polynomial $p(X) = \sum_{i=0}^{n} a_i X^i$ then
$|A| = (-1)^n a_0$.


Proof We note that $a_0 = p(0) = |0I - A| = |-A| = (-1)^n |A|$ and so $|A| =
(-1)^n a_0$.


The speed with which we can compute the characteristic polynomial of a matrix
depends on the speed with which we can multiply two matrices. In 1985, Swiss
computer scientist Walter Keller-Gehrig showed that if we can multiply two $n \times n$
matrices over a field $F$ in an order of $n^c$ operations, then we can calculate the
characteristic polynomial of an $n \times n$ matrix over $F$ in an order of $n^c \log(n)$ operations.
In 2007, French mathematician Clément Pernet and German/Canadian computer
scientist Arne Storjohann constructed a new algorithm with an expected cost on the
order of $n^c$, provided that the field $F$ has at least $2n^2$ elements. If one has the use of
a computer with n> parallel processors, then much faster computation times can be 
obtained. 

Any monic polynomial in $F[X]$ of positive degree is the characteristic polynomial
of some square matrix over $F$. To see this, consider a polynomial $p(X) =
\sum_{i=0}^{n} a_i X^i$, for $n > 0$. If $p(X)$ is monic, define the companion matrix of $p(X)$,
denoted by $\operatorname{comp}(p) \in \mathcal{M}_{n \times n}(F)$, to be the matrix $[a_{ij}]$ given by
\[
a_{ij} = \begin{cases} 1 & \text{if } i = j + 1 \text{ and } j < n, \\ -a_{i-1} & \text{if } j = n, \\ 0 & \text{otherwise.} \end{cases}
\]
Otherwise, define $\operatorname{comp}(p)$ to be $\operatorname{comp}(a_n^{-1} p)$.

Companion matrices were first studied at the 
beginning of the twentieth century by German 
mathematician Alfred Loewy. The term was first 
introduced by the twentieth-century American 
mathematician Cyrus Macduffee. 


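Proposition 12.8 Let $F$ be a field and let $n$ be a positive integer. If $p(X) =
\sum_{i=0}^{n} a_i X^i \in F[X]$ is monic, then $p(X)$ is the characteristic polynomial of
$\operatorname{comp}(p) \in \mathcal{M}_{n \times n}(F)$.


Proof We will proceed by induction on $n$. For $n = 1$, the result is immediate. If
$n = 2$ and if $p(X) = X^2 + a_1 X + a_0$, then $\operatorname{comp}(p) = \begin{bmatrix} 0 & -a_0 \\ 1 & -a_1 \end{bmatrix}$ and so the characteristic
polynomial of $\operatorname{comp}(p)$ is $\begin{vmatrix} X & a_0 \\ -1 & X + a_1 \end{vmatrix} = p(X)$ and we are done. Assume
now that $n > 2$ and the result has been established for $n - 1$. Then the characteristic
polynomial of $\operatorname{comp}(p)$ is
\[
\begin{vmatrix}
X & 0 & \cdots & 0 & a_0 \\
-1 & X & \cdots & 0 & a_1 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & -1 & X + a_{n-1}
\end{vmatrix}.
\]
By Proposition 11.11, this equals $X|XI - \operatorname{comp}(q)| + a_0(-1)^{n-1}|B|$, where
$q(X) = \sum_{i=0}^{n-1} a_{i+1} X^i$ and where $B \in \mathcal{M}_{(n-1) \times (n-1)}(F)$ is an upper-triangular matrix
with diagonal entries all equal to $-1$. Thus $|B| = (-1)^{n-1}$ and, by the induction
hypothesis, $|XI - \operatorname{comp}(q)| = q(X)$. Thus the characteristic polynomial of $\operatorname{comp}(p)$ is
$Xq(X) + a_0 = p(X)$, as desired.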
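
Proposition 12.8 can be tested on a concrete polynomial. The sketch below builds the companion matrix from the definition (with indices shifted to start at 0) and compares $|xI - \operatorname{comp}(p)|$ with $p(x)$ at several integer points; since both sides are monic of degree $n$, agreement at $n$ or more points forces equality. The sample polynomial and helper names are assumptions made for illustration.

```python
# Sketch: the companion matrix of a monic p has characteristic polynomial p.

def companion(coeffs):
    """coeffs = [a_0, ..., a_{n-1}] of the monic p(X) = X^n + a_{n-1}X^{n-1} + ... + a_0."""
    n = len(coeffs)
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if j == n - 1:
                c[i][j] = -coeffs[i]     # last column holds -a_0, ..., -a_{n-1}
            elif i == j + 1:
                c[i][j] = 1              # ones on the subdiagonal
    return c


def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def char_poly_at(m, x):
    """Value of |xI - m| at the scalar x."""
    n = len(m)
    return det([[(x if i == j else 0) - m[i][j] for j in range(n)] for i in range(n)])


def p_at(coeffs, x):
    """Value of the monic polynomial with lower coefficients coeffs at x."""
    return x ** len(coeffs) + sum(a * x ** i for i, a in enumerate(coeffs))


coeffs = [-4, 17, -8]                    # assumed sample: p(X) = X^3 - 8X^2 + 17X - 4
C = companion(coeffs)
agree = all(char_poly_at(C, x) == p_at(coeffs, x) for x in range(-3, 4))
print(C, agree)
```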


Let $F$ be a field and let $n$ be a positive integer. Every nonsingular matrix
$P \in \mathcal{M}_{n \times n}(F)$ defines a function $\omega_P$ from $\mathcal{M}_{n \times n}(F)$ to itself given by $\omega_P : A \mapsto
P^{-1}AP$. In fact, $\omega_P \in \operatorname{Aut}(\mathcal{M}_{n \times n}(F))$, where $\omega_P^{-1} = \omega_{P^{-1}}$. This is an automorphism
of $F$-algebras and, indeed, it can be shown that every automorphism of unital
$F$-algebras in $\operatorname{Aut}(\mathcal{M}_{n \times n}(F))$ is of this form. Therefore, the set of all automorphisms
of the form $\omega_P$ is a group of automorphisms of $\mathcal{M}_{n \times n}(F)$ and so defines an
equivalence relation $\sim$ by setting $A \sim B$ if and only if $B = P^{-1}AP$ for some nonsingular
matrix $P$. In this case, we say that the matrices $A$ and $B$ are similar. From what we have
already seen, two matrices in $\mathcal{M}_{n \times n}(F)$ are similar if and only if they represent the same
endomorphism of an $n$-dimensional vector space over $F$ with respect to different bases. One
of the problems before us is to decide, given two square matrices of the same size,
if they are similar or not.

Note that if a matrix $A \in \mathcal{M}_{n \times n}(F)$ is similar to $O$, then it must equal $O$. Indeed,
if $P^{-1}AP = O$ then $A = (PP^{-1})A(PP^{-1}) = P(P^{-1}AP)P^{-1} = POP^{-1} = O$.

Example In $\mathcal{M}_{3 \times 3}(\mathbb{Q})$, the matrices
\[
A = \begin{bmatrix} 20 & 10 & 10 \\ 10 & 0 & 10 \\ 10 & 10 & 10 \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} 80 & 130 & 100 \\ 10 & 10 & 10 \\ -50 & -80 & -60 \end{bmatrix}
\]
are similar, since $B = P^{-1}AP$, where $P = \begin{bmatrix} 1 & 2 & 1 \\ 1 & 0 & 1 \\ 2 & 3 & 3 \end{bmatrix}$. Thus we note that a symmetric
matrix may be similar to a matrix which is not symmetric.

Example The matrices $A = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 1 \\ -1 & 0 & 2 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}$ in $\mathcal{M}_{3 \times 3}(\mathbb{Q})$ are
not similar since, were they similar, the matrices $A - I$ and $B - I$ would also be
similar, and thus have the same rank. But it is easy to see that the rank of $A - I$
equals 1, while the rank of $B - I$ equals 2.


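Example If matrices $A, B \in \mathcal{M}_{n \times n}(F)$ are similar, it does not follow that they commute.
For example, let $A = \begin{bmatrix} 1 & 0 & -1 \\ 2 & 3 & 0 \\ -1 & 0 & -2 \end{bmatrix} \in \mathcal{M}_{3 \times 3}(\mathbb{R})$. Then $P = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}$
is nonsingular and so $B = PAP^{-1} = \begin{bmatrix} 1 & 1 & -1 \\ 2 & 3 & 0 \\ 1 & 5 & -2 \end{bmatrix}$ is similar to $A$. However,
$AB \neq BA$.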


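Example Let $F$ be a field and, for each $1 \leq h \leq t$, let $A_h$ be a square matrix over $F$,
which is similar to a square matrix $B_h$ over $F$. That is to say, there exists a nonsingular
square matrix $P_h$ such that $B_h = P_h A_h P_h^{-1}$. Let $A$ be the matrix in block form
\[
A = \begin{bmatrix} A_1 & O & \cdots & O \\ O & A_2 & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & A_t \end{bmatrix}
\]
in which all blocks not on the diagonal are equal to $O$, and let
\[
B = \begin{bmatrix} B_1 & O & \cdots & O \\ O & B_2 & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & B_t \end{bmatrix}.
\]
Then $A$ is similar to $B$, since $B = PAP^{-1}$, where
\[
P = \begin{bmatrix} P_1 & O & \cdots & O \\ O & P_2 & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & P_t \end{bmatrix}.
\]
We will make use of this fact in the next chapter.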




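Proposition 12.9 Let $F$ be a field and let $k < n$ be positive integers.
Let $A \in \mathcal{M}_{n \times n}(F)$ be a matrix which can be written in block form as
$\begin{bmatrix} A_{11} & A_{12} \\ O & A_{22} \end{bmatrix}$, where $A_{11} \in \mathcal{M}_{k \times k}(F)$ and $A_{22} \in \mathcal{M}_{(n-k) \times (n-k)}(F)$. Then
$\operatorname{spec}(A) = \operatorname{spec}(A_{11}) \cup \operatorname{spec}(A_{22})$.


Proof Let $c \in \operatorname{spec}(A)$ and let $v \in F^n$ be an eigenvector associated with $c$. Write
$v = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$, where $v_1 \in F^k$ and $v_2 \in F^{n-k}$. Then
\[
\begin{bmatrix} A_{11}v_1 + A_{12}v_2 \\ A_{22}v_2 \end{bmatrix}
= \begin{bmatrix} A_{11} & A_{12} \\ O & A_{22} \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}
= Av = cv = \begin{bmatrix} cv_1 \\ cv_2 \end{bmatrix}.
\]
From this we see immediately that if $v_2 \neq \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$ then $c \in \operatorname{spec}(A_{22})$, while if
$v_2 = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$ then $c \in \operatorname{spec}(A_{11})$. Therefore, $\operatorname{spec}(A)$ is contained in $\operatorname{spec}(A_{11}) \cup
\operatorname{spec}(A_{22})$.

Conversely, let $c \in \operatorname{spec}(A_{11})$ and let $v_1 \in F^k$ be an eigenvector associated
with $c$. Then $A\begin{bmatrix} v_1 \\ 0 \end{bmatrix} = \begin{bmatrix} A_{11}v_1 \\ 0 \end{bmatrix} = c\begin{bmatrix} v_1 \\ 0 \end{bmatrix}$, proving that $c \in \operatorname{spec}(A)$. Now assume
that $d \in \operatorname{spec}(A_{22}) \smallsetminus \operatorname{spec}(A_{11})$ and let $v_2 \in F^{n-k}$ be an eigenvector associated
with $d$. Since $d \notin \operatorname{spec}(A_{11})$, we know that the matrix $B = A_{11} - dI \in \mathcal{M}_{k \times k}(F)$ is
nonsingular. Set $v_1 = B^{-1}A_{12}(-v_2)$. Then $(A - dI)\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} Bv_1 + A_{12}v_2 \\ (A_{22} - dI)v_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$,
showing that $d \in \operatorname{spec}(A)$. Therefore, $\operatorname{spec}(A_{11}) \cup \operatorname{spec}(A_{22}) \subseteq \operatorname{spec}(A)$,
proving equality.


Example Let $A = \begin{bmatrix} 1 & 1 & 5 & 6 \\ -1 & 1 & 7 & 3 \\ 0 & 0 & 2 & 1 \\ 0 & 0 & -4 & 3 \end{bmatrix} \in \mathcal{M}_{4 \times 4}(\mathbb{C})$. Then
\[
\operatorname{spec}(A) = \operatorname{spec}\left( \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \right) \cup \operatorname{spec}\left( \begin{bmatrix} 2 & 1 \\ -4 & 3 \end{bmatrix} \right)
= \{1 \pm i\} \cup \left\{ \tfrac{1}{2}\left(5 \pm i\sqrt{15}\right) \right\}.
\]


Proposition 12.10 Similar matrices in $\mathcal{M}_{n \times n}(F)$, where $F$ is a field and
where $n$ is a positive integer, have identical characteristic polynomials.


Proof If $A, B \in \mathcal{M}_{n \times n}(F)$ satisfy $B = P^{-1}AP$ then
\[
|XI - B| = |XI - P^{-1}AP| = |P^{-1}(XI - A)P| = |P|^{-1}|XI - A||P| = |XI - A|,
\]
as required.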
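
Proposition 12.10 can be checked on the matrices $A$, $B$, and $P$ from the earlier example of similar matrices in $\mathcal{M}_{3 \times 3}(\mathbb{Q})$. The following sketch first confirms $PB = AP$ (equivalent to $B = P^{-1}AP$ without computing an inverse) and then compares $|xI - A|$ with $|xI - B|$ at several integer points, which suffices because both are monic of degree 3. The helper names are hypothetical.

```python
# Sketch: similar matrices have the same characteristic polynomial.

def mat_mul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def char_poly_at(m, x):
    """Value of |xI - m| at the scalar x."""
    n = len(m)
    return det([[(x if i == j else 0) - m[i][j] for j in range(n)] for i in range(n)])


A = [[20, 10, 10], [10, 0, 10], [10, 10, 10]]
B = [[80, 130, 100], [10, 10, 10], [-50, -80, -60]]
P = [[1, 2, 1], [1, 0, 1], [2, 3, 3]]

similar = det(P) != 0 and mat_mul(P, B) == mat_mul(A, P)   # B = P^{-1} A P
same_poly = all(char_poly_at(A, x) == char_poly_at(B, x) for x in range(-2, 3))
print(similar, same_poly)
```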


Example The converse of Proposition 12.10 is false. Indeed, the matrices $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$
and $\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$ are not similar, despite the fact that both of them have the same
characteristic polynomial, $(X - 1)^2$.


A generalization of Proposition 12.10 tells us that if $P, Q \in \mathcal{M}_{n \times n}(F)$ are nonsingular
matrices satisfying $|PQ| = 1$, then the endomorphism $\alpha_{PQ}$ of $\mathcal{M}_{n \times n}(F)$
given by $\alpha_{PQ} : A \mapsto PAQ$ satisfies the condition that $A$ and $\alpha_{PQ}(A)$ always have
identical characteristic polynomials. The same goes for the linear transformation
$\beta_{PQ} : A \mapsto PA^TQ$. Frobenius proved that any endomorphism of $\mathcal{M}_{n \times n}(\mathbb{C})$ which
preserves characteristic polynomials must be of one of these two forms. Note that
endomorphisms of the form $\alpha_{PQ}$ or $\beta_{PQ}$ are in fact automorphisms of $\mathcal{M}_{n \times n}(F)$.
They also satisfy the property that $\alpha_{PQ}(A)$ is singular if and only if $A$ is singular,
and similarly $\beta_{PQ}(A)$ is singular if and only if $A$ is singular. Indeed, Dieudonné has
shown that, for any field $F$, an endomorphism of $\mathcal{M}_{n \times n}(F)$ satisfying this condition
must be of one of these two forms.



The twentieth-century French mathematician Jean Dieudonné was
one of the founders of the influential group who wrote under the collective
name of Nicolas Bourbaki.


Example If A and B are square matrices over a field F, then we know that the 
matrices AB and BA are not necessarily equal. They are also not necessarily similar. 
For example, if A = : and B = A ; then AB = O + BA, and so AB 
and $BA$ are not similar. Nonetheless, by Proposition 12.3, we see that $\operatorname{spec}(AB) =
\operatorname{spec}(BA)$.

Proposition 12.10 can be used to facilitate computation, as the following example
shows.


Example Let $n$ be a positive integer, let $F$ be a field, and let $A = [a_{ij}] \in \mathcal{M}_{n \times n}(F)$
be a symmetric tridiagonal matrix. That is to say, the entries of $A$ satisfy the condition
that $a_{ij} = a_{ji}$ when $|i - j| = 1$ and $a_{ij} = 0$ when $|i - j| > 1$. Set $p_0(X) = 1$ and,
for each $1 \leq k \leq n$, let $p_k(X)$ be the determinant of the $k \times k$ submatrix
consisting of the first $k$ rows and first $k$ columns of the matrix $XI - A \in \mathcal{M}_{n \times n}(F[X])$.
Then $p_n(X)$ is the characteristic polynomial of $A$ and we have $p_1(X) = X - a_{11}$
and $p_k(X) = (X - a_{kk})p_{k-1}(X) - a_{k,k-1}^2 p_{k-2}(X)$ for each $2 \leq k \leq n$. This recursion
relation allows us to compute the characteristic polynomial of $A$ quickly. Therefore,
if $A$ is any symmetric matrix, a good strategy is to try and find a symmetric
tridiagonal matrix similar to it and then compute its characteristic polynomial.
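
The three-term recursion for symmetric tridiagonal matrices translates directly into code. The sketch below represents polynomials as coefficient lists (lowest degree first) and starts the recursion from the constant polynomial 1 and from $X - a_{11}$; the sample matrix and the function names are assumptions made for illustration.

```python
# Sketch: characteristic polynomial of a symmetric tridiagonal matrix
# via p_k = (X - a_kk) p_{k-1} - a_{k,k-1}^2 p_{k-2}.

def times_linear(p, r):
    """Coefficients of (X - r) * p(X), with p given lowest degree first."""
    result = [0] * (len(p) + 1)
    for i, coeff in enumerate(p):
        result[i + 1] += coeff       # contribution of X * p
        result[i] -= r * coeff       # contribution of -r * p
    return result


def poly_add(p, q):
    """Sum of two polynomials of possibly different lengths."""
    length = max(len(p), len(q))
    p = p + [0] * (length - len(p))
    q = q + [0] * (length - len(q))
    return [x + y for x, y in zip(p, q)]


def tridiagonal_char_poly(diag, off):
    """diag = [a_11, ..., a_nn]; off = [a_21, a_32, ..., a_{n,n-1}]."""
    prev2 = [1]                      # constant polynomial 1 starts the recursion
    prev1 = [-diag[0], 1]            # p_1 = X - a_11
    for k in range(1, len(diag)):
        shifted = times_linear(prev1, diag[k])
        correction = [-(off[k - 1] ** 2) * c for c in prev2]
        prev2, prev1 = prev1, poly_add(shifted, correction)
    return prev1


# Assumed sample: diagonal entries 2, off-diagonal entries 1.
result = tridiagonal_char_poly([2, 2, 2], [1, 1])
print(result)    # coefficients of X^3 - 6X^2 + 10X - 4, lowest degree first
```

Each step costs a number of operations proportional to $k$, so the whole polynomial is obtained in roughly $n^2$ scalar operations.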


Let $\alpha$ be an endomorphism of a vector space $V$ finitely generated over a field $F$
and let $c \in \operatorname{spec}(\alpha)$. The algebraic multiplicity of $c$ is the largest integer $k$ such that
$(X - c)^k$ divides the characteristic polynomial of $\alpha$. The geometric multiplicity of $c$
is the dimension of the eigenspace of $\alpha$ associated with $c$. The geometric multiplicity
of $c$ is not greater than its algebraic multiplicity, but these two numbers need not be
equal, as the following examples show. If these two multiplicities are equal, we say
that $c$ is a semisimple eigenvalue of $\alpha$; an eigenvalue which is not semisimple is
defective. In particular, if the algebraic multiplicity of $c$ is 1 then the same must be
true for its geometric multiplicity. In that case, we say that $c$ is a simple eigenvalue
of $\alpha$. If at least one eigenvalue of $\alpha$ has geometric multiplicity greater than 1, then
$\alpha$ is derogatory; otherwise, it is nonderogatory.


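Example If $\alpha \in \operatorname{End}(\mathbb{R}^2)$ is defined by $\alpha : \begin{bmatrix} a \\ b \end{bmatrix} \mapsto \begin{bmatrix} a + b \\ b \end{bmatrix}$ then $c = 1$ is an eigenvalue
of $\alpha$ with associated eigenspace $\mathbb{R}\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and so the geometric multiplicity of
$c$ is 1. On the other hand, $\alpha$ is represented with respect to the canonical basis by
the matrix $\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$ so its characteristic polynomial is $(X - 1)^2$, implying that the
algebraic multiplicity of $c$ is 2.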


Example Let $\alpha \in \operatorname{End}(\mathbb{R}^3)$ be the endomorphism represented with respect to the
canonical basis by the matrix $\begin{bmatrix} 2 & 3 & 1 \\ 3 & 2 & 4 \\ 0 & 0 & -1 \end{bmatrix}$. The characteristic polynomial of $\alpha$
is $(X - 5)(X + 1)^2$ and so $\operatorname{spec}(\alpha) = \{-1, 5\}$, where the algebraic multiplicity of
$-1$ equals 2 and the algebraic multiplicity of 5 equals 1. The eigenspace associated
with $-1$ is $\mathbb{R}\begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}$ and the eigenspace associated with 5 is $\mathbb{R}\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$. Thus both
eigenvalues have geometric multiplicity 1. Hence, 5 is a simple eigenvalue of $\alpha$
whereas $-1$ is defective.
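
Geometric multiplicities can be computed as $n - \operatorname{rank}(A - cI)$. The sketch below does this for the matrix of the example using Gaussian elimination over the rationals; the function names are hypothetical, and the algebraic multiplicities are simply taken from the factorization $(X - 5)(X + 1)^2$ stated above.

```python
from fractions import Fraction

# Sketch: geometric multiplicity of c as n - rank(A - cI), by row reduction.

def rank(m):
    """Rank of a matrix (list of rows), computed exactly over the rationals."""
    m = [[Fraction(x) for x in row] for row in m]
    rows, cols = len(m), len(m[0])
    r = 0
    for col in range(cols):
        pivot = next((i for i in range(r, rows) if m[i][col] != 0), None)
        if pivot is None:
            continue                          # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, rows):
            factor = m[i][col] / m[r][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == rows:
            break
    return r


def geometric_multiplicity(a, c):
    """Dimension of the eigenspace of a associated with c."""
    n = len(a)
    shifted = [[a[i][j] - (c if i == j else 0) for j in range(n)] for i in range(n)]
    return n - rank(shifted)


A = [[2, 3, 1], [3, 2, 4], [0, 0, -1]]
algebraic = {-1: 2, 5: 1}                     # read off from (X - 5)(X + 1)^2
geometric = {c: geometric_multiplicity(A, c) for c in algebraic}
defective = [c for c in algebraic if geometric[c] < algebraic[c]]
print(geometric, defective)
```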


Let n be a positive integer. If a € End(R”) is represented with respect to a given 
basis of R” by a matrix all entries in which are positive, then Perron, using ana- 
lytic methods, showed that the eigenvalue of largest absolute value of œ is simple 
and positive, and has an associated eigenvector all entries of which are positive. 
This result has many important applications in statistics and economics, especially 
in input-output analysis. It was also used by Thurston in his classification of sur- 
face diffeomorphisms in topology. Perron’s results were later extended by Frobe- 
nius to certain matrices all entries in which are nonnegative, and later by Karlin to 
certain endomorphisms of spaces which are not finite-dimensional. In 1948, Philip 
Stein and R.L. Rosenberg used Frobenius’ extension of Perron’s results to com- 
pare the convergence rates of the Jacobi and Gauss-Seidel iteration methods for 
solution of systems of linear equations. Their results have since been considerably 
extended. 


With kind permission of the Archives of the Mathematisches Forschungsinstitut Oberwolfach (Perron, Frobe- 
nius, Thurston). 

The twentieth-century German mathematician Oskar Perron worked in many areas of al- 
gebra and geometry. Fellow German mathematician Georg Frobenius is known for his 
important work in group theory and his work on bilinear forms. He was also the first to 
consider the rank of a matrix. William Thurston is a contemporary American geometer; 
the twentieth-century American applied mathematician Samuel Karlin published exten- 
sively in probability and statistics, as well as mathematical biology. 


Proposition 12.11 Let V be a vector space finitely generated over a field F
and let $\alpha$ be an endomorphism of V satisfying the condition that the
characteristic polynomial of $\alpha$ is completely reducible. Then $\alpha$ is diagonalizable if
and only if every eigenvalue of $\alpha$ is semisimple.


Proof Let $\dim(V) = n$ and let $\operatorname{spec}(\alpha) = \{c_1, \dots, c_k\}$. First of all, we will assume that
there exists a basis D of V such that $\Phi_{DD}(\alpha)$ is a diagonal matrix. For each
$1 \le j \le k$, denote by m(j) the number of times that $c_j$ appears on the diagonal of
$\Phi_{DD}(\alpha)$. Then $\sum_{j=1}^{k} m(j) = n$ and, by Proposition 12.4, we know that for each
$1 \le j \le k$ there exists a subset of D, having m(j) elements, which is a basis for the
eigenspace of $\alpha$ associated with $c_j$. Moreover, the characteristic polynomial of $\alpha$ is
$\prod_{j=1}^{k} (X - c_j)^{m(j)}$
and so m(j) equals both the algebraic multiplicity and the geometric multiplicity of
$c_j$ for each $1 \le j \le k$, proving that each such $c_j$ is semisimple. Conversely,
assume that each $c_j$ is semisimple, and for each $1 \le j \le k$ let m(j) be the algebraic
(and geometric) multiplicity of $c_j$. Let $D_j$ be a basis for the eigenspace of $\alpha$
associated with $c_j$, and let $D = \bigcup_{j=1}^{k} D_j$. Then D is a linearly-independent subset of
V having n elements, and so is a basis of V over F. The result then follows from
Proposition 7.5.


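Example The condition in Proposition 12.11 that the characteristic polynomial of $\alpha$
be completely reducible is essential. To see this, consider the endomorphism $\alpha$ of
$\mathbb{R}^3$ represented with respect to the canonical basis by the matrix
$A = \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$.
The characteristic polynomial of $\alpha$ is $(X - 1)(X^2 + 1) \in \mathbb{R}[X]$ and so
$\operatorname{spec}(\alpha) = \{1\}$, where 1 is a simple eigenvalue of $\alpha$ and so it is surely
semisimple. The eigenspace of $\alpha$ associated with this eigenvalue is
$\mathbb{R}\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$ and so its dimension is 1. Since there is
no basis of $\mathbb{R}^3$ composed of eigenvectors of $\alpha$, we see that $\alpha$ is
not diagonalizable.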


Example Consider the endomorphism $\alpha$ of $\mathbb{R}^3$ represented with respect to the
canonical basis by the matrix
$\begin{bmatrix} -1 & -1 & -2 \\ 8 & -11 & -8 \\ -10 & 11 & 7 \end{bmatrix}$
and let $\beta$ be the endomorphism of $\mathbb{R}^3$ represented with respect to the
canonical basis by the matrix
$\begin{bmatrix} 1 & -4 & -4 \\ 8 & -11 & -8 \\ -8 & 8 & 5 \end{bmatrix}$.
These two endomorphisms have the same characteristic polynomial
$X^3 + 5X^2 + 3X - 9 = (X - 1)(X + 3)^2$. Thus the algebraic multiplicity
of the eigenvalue 1 equals 1 and the algebraic multiplicity of the eigenvalue $-3$
equals 2. But for $\alpha$, the geometric multiplicity of $-3$ equals 1, so $\alpha$ is not
diagonalizable. On the other hand, for $\beta$ the geometric multiplicity of $-3$ equals 2,
and so $\beta$ is diagonalizable.
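The comparison in this example can be checked mechanically. The following is a minimal sketch, not part of the original text: it computes geometric multiplicities as $n - \operatorname{rank}(A - cI)$ using exact rational arithmetic and then applies the criterion of Proposition 12.11. The function names are hypothetical, the matrix entries are read off from the example (reconstructed so that both characteristic polynomials equal $(X - 1)(X + 3)^2$), and the algebraic multiplicities are supplied by hand rather than computed.

```python
from fractions import Fraction


def rank(matrix):
    """Rank over Q, by Gaussian elimination with exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0  # number of pivots found so far
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def geometric_multiplicity(matrix, c):
    """dim ker(matrix - c*I), computed as n - rank(matrix - c*I)."""
    n = len(matrix)
    shifted = [[matrix[i][j] - (c if i == j else 0) for j in range(n)] for i in range(n)]
    return n - rank(shifted)


def is_diagonalizable(matrix, algebraic):
    """`algebraic` maps each eigenvalue to its algebraic multiplicity (assumed known)."""
    return all(geometric_multiplicity(matrix, c) == m for c, m in algebraic.items())


alpha = [[-1, -1, -2], [8, -11, -8], [-10, 11, 7]]
beta = [[1, -4, -4], [8, -11, -8], [-8, 8, 5]]
multiplicities = {1: 1, -3: 2}  # read off from (X - 1)(X + 3)^2

print(geometric_multiplicity(alpha, -3), is_diagonalizable(alpha, multiplicities))
print(geometric_multiplicity(beta, -3), is_diagonalizable(beta, multiplicities))
```

The check is only valid when the characteristic polynomial is completely reducible and every eigenvalue is listed, which is exactly the hypothesis of Proposition 12.11.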


Let F be a field and let $(K, \bullet)$ be an associative unital F-algebra. If $v \in K$
and if $p(X) = \sum_{i=0}^{n} c_i X^i \in F[X]$, then $p(v) = \sum_{i=0}^{n} c_i v^i \in K$. For any
polynomial $q(X) \in F[X]$ we have $p(v) \bullet q(v) = q(v) \bullet p(v)$. In particular,
$v \bullet p(v) = p(v) \bullet v$. It is clear that
$\operatorname{Ann}(v) = \{p(X) \in F[X] \mid p(v) = 0_K\}$ is a subspace
of F[X]. If $p(v) = 0_K$, we say that v annihilates the polynomial p(X).

In particular, we note that all of the above is true for the associative unital
F-algebra $M_{n \times n}(F)$, where n is a positive integer. We note that if $A \sim B$
in $M_{n \times n}(F)$ then there is a nonsingular matrix P such that $B = P^{-1}AP$ and
so $p(B) = P^{-1}p(A)P$, so that if $p(A) = O$ then $p(B) = O$. Thus we see that
$\operatorname{Ann}(A) = \operatorname{Ann}(B)$ whenever the matrices A and B are similar.
Example Let $A = [\dots] \in M_{2 \times 2}(\mathbb{R})$ and let $p(X) = X^2 - X + 2 \in \mathbb{R}[X]$. Then
$p(A) = A^2 - A + 2I = [\dots]$. If $q(X) = X^2 - 2X - 1$ then $q(A) = O$, so
$q(X) \in \operatorname{Ann}(A)$.


Proposition 12.12 Let F be a field and let $(K, \bullet)$ be an associative unital
F-algebra finitely generated over F. Then $\operatorname{Ann}(v)$ is nontrivial for each
$v \in K$.


Proof Let $\dim(K) = n$. If $v \in K$ then the $n + 1$ elements $v^0, v^1, \dots, v^n$ cannot form a
linearly independent set and so there exist scalars $a_0, \dots, a_n$, not all equal to 0,
such that $\sum_{i=0}^{n} a_i v^i = 0_K$. In other words, there exists a nonzero polynomial
$p(X) = \sum_{i=0}^{n} a_i X^i$ in $\operatorname{Ann}(v)$.


We now show why one cannot define “three-dimensional complex numbers”. 


Proposition 12.13 If n is an odd integer greater than 1 then there is no way
of defining on $\mathbb{R}^n$ the structure of an $\mathbb{R}$-algebra which is also a field.


Proof Assume that we can define an operation on $V = \mathbb{R}^n$ (which we will denote by
concatenation) which turns it into an $\mathbb{R}$-algebra which is also a field, and let $v_1$ be
the identity element for this operation. Then $V \neq \mathbb{R}v_1$ since $\dim(V) > 1$. Pick an
element $y \in V \setminus \mathbb{R}v_1$ and let $\alpha \in \operatorname{End}(V)$ be given by
$\alpha\colon v \mapsto yv$, which is represented with respect to the canonical basis of
$\mathbb{R}^n$ by a matrix A. The characteristic polynomial p(X) of A belongs to
$\mathbb{R}[X]$ and has odd degree; therefore, it has a root c in $\mathbb{R}$. Thus
$p(X) = (X - c)^k q(X)$ for some $k \ge 1$ and some $q(X) \in \mathbb{R}[X]$ satisfying
$q(c) \neq 0$. Let $\beta \in \operatorname{End}(V)$ be given by $\beta\colon v \mapsto (y - cv_1)^k v$. Then
$\beta \neq \sigma_0$ since $y \notin \mathbb{R}v_1$, and $0_V \neq q(c)v_1 = q(cv_1)$. But then
$(y - cv_1)^k q(cv_1) = 0_V$, contradicting Proposition 2.3(12).


Let F be a field and let $(K, \bullet)$ be an associative unital F-algebra. If $v \in K$
satisfies the condition that $\operatorname{Ann}(v)$ is nontrivial then $\operatorname{Ann}(v)$ must contain a
polynomial $p(X) = \sum_{i=0}^{n} a_i X^i$ of minimal degree. This means, in particular, that
$a_n \neq 0$ and so the monic polynomial $a_n^{-1}p(X)$ also belongs to $\operatorname{Ann}(v)$. We claim
that it is the unique monic polynomial of minimal degree in $\operatorname{Ann}(v)$. Indeed, if q(X) is
a monic polynomial of degree n belonging to $\operatorname{Ann}(v)$ not equal to $a_n^{-1}p(X)$, then
$r(X) = q(X) - a_n^{-1}p(X)$ is a nonzero element of $\operatorname{Ann}(v)$. But $\deg(r) < n$,
contradicting the minimality of the degree n of p(X). Thus we see that $\operatorname{Ann}(v)$, if
nontrivial, contains a unique monic polynomial of minimal positive degree, which we call
the minimal polynomial of v over F and denote by $m_v(X)$.

Example We know that $\mathbb{C}$ is an associative unital $\mathbb{R}$-algebra. If
$c = a + bi \in \mathbb{C} \setminus \mathbb{R}$, then its minimal polynomial over $\mathbb{R}$ is
$(X - c)(X - \bar{c}) = X^2 - 2aX + (a^2 + b^2)$.


In particular, if F is a field and if n is a positive integer, then any matrix
$A \in M_{n \times n}(F)$ has a minimal polynomial, which we denote by $m_A(X)$. If A and
B are similar matrices, then $m_A(X) = m_B(X)$. Similarly, if V is a vector space
finitely generated over a field F, and if $\alpha \in \operatorname{End}(V)$ then $\alpha$ has a minimal
polynomial $m_\alpha(X)$, and this equals the minimal polynomial of $\Phi_{DD}(\alpha)$ for any
basis D of V. If $f(X) \in F[X]$ is monic of positive degree then it is easy to see that
$f(X) = m_{\operatorname{comp}(f)}(X)$, where $\operatorname{comp}(f)$ is the companion matrix of f, and so
every such polynomial is the minimal polynomial of some matrix.


Example Let $(K, \bullet)$ be an associative unital entire $\mathbb{R}$-algebra. Assume that $v \in K$
has a minimal polynomial $m_v(X) \in \mathbb{R}[X]$. By Proposition 4.4, we know that
$m_v(X) = \prod_{i=1}^{t} p_i(X)$, where the $p_i(X)$ are irreducible polynomials of degree at
most 2. But then $\prod_{i=1}^{t} p_i(v) = 0_K$ and, since K is entire, there is some index h
such that $p_h(v) = 0_K$. By minimality, this means that $m_v(X) = p_h(X)$. We thus
conclude that any element of K having a minimal polynomial has one of degree at
most 2.


Proposition 12.14 Let F be a field and let $(K, \bullet)$ be an associative
F-algebra finitely generated over F. If $v \in K$ satisfies the condition that
$\operatorname{Ann}(v)$ is nontrivial and if $p(X) \in \operatorname{Ann}(v)$, then there is a polynomial
$q(X) \in F[X]$ satisfying $p(X) = m_v(X)q(X)$.


Proof If p(X) is the 0-polynomial, pick q(X) to be the 0-polynomial, and we are
done. Therefore, assume that $\deg(p) \ge 0$. From Proposition 4.2, we know that we
can write $p(X) = m_v(X)q(X) + r(X)$, where $q(X), r(X) \in F[X]$, with
$\deg(r) < \deg(m_v)$. Since $p(v) = 0_K$, we see that
$0_K = m_v(v) \bullet q(v) + r(v) = r(v)$, so $r(X) \in \operatorname{Ann}(v)$. Since
$\deg(r) < \deg(m_v)$, the minimality of $m_v(X)$ forces $\deg(r) = -\infty$, and so
$p(X) = m_v(X)q(X)$.


With kind permission of the Archives of the Mathematisches Forschungsinstitut Ober- 
wolfach. 


This fundamental result was first established at the beginning of the 
twentieth century by the German mathematician Kurt Hensel. 
Proposition 12.15 Let F be a field and let $(K, \bullet)$ be an associative unital
F-algebra with multiplicative identity e. If $v \in K$ has a minimal polynomial
$m_v(X) = \sum_{i=0}^{n} a_i X^i$ then:

(1) v is a unit of K if and only if $a_0 \neq 0$; and

(2) If v is a unit of K then $v^{-1} = g(v)$, where
$g(X) = -a_0^{-1} \sum_{i=1}^{n} a_i X^{i-1} \in F[X]$.


Proof If $a_0 \neq 0$ then $m_v(v) = 0_K$ implies that
$e = -a_0^{-1}\left[\sum_{i=1}^{n} a_i v^i\right] = -a_0^{-1}\left[\sum_{i=1}^{n} a_i v^{i-1}\right] \bullet v = g(v) \bullet v = v \bullet g(v)$
and so v is a unit and $v^{-1} = g(v)$. Conversely, assume that v is a unit. Had we
$a_0 = 0$, we would have $0_K = m_v(v) = v \bullet \left[\sum_{i=1}^{n} a_i v^{i-1}\right]$ and so
$0_K = v^{-1} m_v(v) = \sum_{i=1}^{n} a_i v^{i-1}$. Thus
$\sum_{i=1}^{n} a_i X^{i-1} \in \operatorname{Ann}(v)$, contradicting the minimality of the degree of
$m_v(X)$. Hence $a_0 \neq 0$.


It is important to note that the minimal polynomial of a matrix over a field need
not equal its characteristic polynomial. For example, if we consider $I \in M_{n \times n}(F)$
for any field F and any integer $n > 1$, then the characteristic polynomial of I is
$(X - 1)^n$ whereas its minimal polynomial is $X - 1$.


Example Let F be a field. The matrix
$A = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix} \in M_{2 \times 2}(F)$ annihilates the
polynomial $X(X - 1)$, and this is in fact its minimal polynomial. It is also the
characteristic polynomial of A. Thus we see that the minimal polynomial of a matrix
does not have to be irreducible. Notice too that the rank of A equals 1, but the
degree of its minimal polynomial is 2. Thus the degree of the minimal polynomial of
a matrix may be larger than its rank.


Example Let F be a field. The matrix
$A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \in M_{3 \times 3}(F)$
annihilates the polynomial $X(X - 1)$, and this is in fact its minimal polynomial. The
characteristic polynomial of A is $X^2(X - 1)$.


Example One can check that
$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 3 \end{bmatrix},
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 3 \end{bmatrix} \in M_{3 \times 3}(\mathbb{Q})$
are not similar, but they both have the same minimal polynomial, namely
$(X - 1)(X - 3)$.
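The minimal polynomials quoted in the preceding examples can be recomputed directly from the definition. The sketch below is an illustration under stated assumptions, not a method given in the text: it works over the rationals with exact fractions, flattens the powers $I, A, A^2, \dots$ into vectors, and stops at the first power that is a linear combination of the earlier ones (such a power exists by Proposition 12.12). The helper names `solve` and `minimal_polynomial` are hypothetical.

```python
from fractions import Fraction


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def solve(columns, target):
    """Return x with sum_j x[j] * columns[j] == target, or None if there is none."""
    m, k = len(target), len(columns)
    aug = [[columns[j][i] for j in range(k)] + [target[i]] for i in range(m)]
    pivots, r = [], 0
    for c in range(k):
        p = next((i for i in range(r, m) if aug[i][c] != 0), None)
        if p is None:
            continue
        aug[r], aug[p] = aug[p], aug[r]
        aug[r] = [x / aug[r][c] for x in aug[r]]
        for i in range(m):
            if i != r and aug[i][c] != 0:
                f = aug[i][c]
                aug[i] = [a - f * b for a, b in zip(aug[i], aug[r])]
        pivots.append(c)
        r += 1
    if any(row[k] != 0 for row in aug[r:]):
        return None  # inconsistent system: target is not in the span
    x = [Fraction(0)] * k
    for row_index, c in enumerate(pivots):
        x[c] = aug[row_index][k]
    return x


def minimal_polynomial(A):
    """Coefficients [a_0, ..., a_{d-1}, 1] of the monic minimal polynomial of A."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    power = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # A^0
    vectors = []  # flattened I, A, ..., A^(d-1), linearly independent so far
    while True:
        flat = [x for row in power for x in row]
        coeffs = solve(vectors, flat) if vectors else None
        if coeffs is not None:
            # A^d = sum_i coeffs[i] A^i, so X^d - sum_i coeffs[i] X^i annihilates A
            return [-c for c in coeffs] + [Fraction(1)]
        vectors.append(flat)
        power = matmul(power, A)


print(minimal_polynomial([[1, 1], [0, 0]]))
print(minimal_polynomial([[1, 0, 0], [0, 3, 0], [0, 0, 3]]))
print(minimal_polynomial([[1, 0, 0], [0, 1, 0], [0, 0, 3]]))
```

Coefficients are listed from the constant term upward, so `[3, -4, 1]` stands for $X^2 - 4X + 3 = (X - 1)(X - 3)$.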


Example Proposition 12.15 can be used to calculate the inverse of a nonsingular
matrix, though it is rarely the most efficient method of doing so. For example, the
matrix $A = \begin{bmatrix} 2 & -2 & 4 \\ 2 & 3 & 2 \\ -1 & 1 & -1 \end{bmatrix}$ has minimal
polynomial $X^3 - 4X^2 + 7X - 10$, so
$$A^{-1} = \frac{1}{10}\left(A^2 - 4A + 7I\right)
= \frac{1}{10}\begin{bmatrix} -5 & 2 & -16 \\ 0 & 2 & 4 \\ 5 & 0 & 10 \end{bmatrix}.$$
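The computation in this example follows part (2) of Proposition 12.15 and can be sketched in a few lines. The code below is an illustration only: the function name `inverse_from_polynomial` is hypothetical, the arithmetic is done with exact fractions, and the polynomial coefficients are taken from the example rather than computed.

```python
from fractions import Fraction


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def inverse_from_polynomial(A, coeffs):
    """coeffs = [a_0, a_1, ..., a_n] of a polynomial annihilated by A with a_0 != 0.

    Returns g(A), where g(X) = -a_0^{-1} * sum_{i >= 1} a_i X^(i-1).
    """
    a0 = Fraction(coeffs[0])
    if a0 == 0:
        raise ValueError("constant term is zero; Proposition 12.15 gives no inverse")
    n = len(A)
    total = [[Fraction(0)] * n for _ in range(n)]
    power = identity(n)  # A^(i-1), starting at i = 1
    for a in coeffs[1:]:
        total = [[total[i][j] + a * power[i][j] for j in range(n)] for i in range(n)]
        power = matmul(power, A)
    return [[-x / a0 for x in row] for row in total]


A = [[2, -2, 4], [2, 3, 2], [-1, 1, -1]]
A_inv = inverse_from_polynomial(A, [-10, 7, -4, 1])  # X^3 - 4X^2 + 7X - 10
for row in A_inv:
    print([str(x) for x in row])
```

As the example itself remarks, this is rarely the most efficient way to invert a matrix; it is shown only to make the formula concrete.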


Proposition 12.16 (Cayley-Hamilton Theorem) Let F be a field and let n
be a positive integer. Then every matrix in $M_{n \times n}(F)$ annihilates its
characteristic polynomial.


Proof Let A be a matrix in $M_{n \times n}(F)$ having characteristic polynomial
$p(X) = X^n + \sum_{i=0}^{n-1} a_i X^i$. Let us look at the matrix
$[g_{ij}(X)] = \operatorname{adj}(XI - A) \in M_{n \times n}(F[X])$,
where each $g_{ij}(X)$ is a polynomial of degree at most $n - 1$. Then we can write this
matrix in the form $\sum_{i=1}^{n} B_i X^{n-i}$, where the $B_i$ are matrices in
$M_{n \times n}(F)$. Moreover, we know that

$$p(X)I = |XI - A|\,I = (XI - A)\operatorname{adj}(XI - A) = (XI - A)\left(\sum_{i=1}^{n} B_i X^{n-i}\right).$$

Equating coefficients of the various powers of X, we thus see

$$\begin{aligned}
B_1 &= I, \\
B_2 - AB_1 &= a_{n-1}I, \\
B_3 - AB_2 &= a_{n-2}I, \\
&\;\;\vdots \\
B_n - AB_{n-1} &= a_1 I, \\
-AB_n &= a_0 I.
\end{aligned}$$

For $1 \le h \le n + 1$, multiply both sides of the hth equation above on the left by
$A^{n+1-h}$ and then sum both sides. The left-hand sides telescope to O and the
right-hand sides add up to p(A), so we obtain $O = p(A)$.
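The recursion $B_{k+1} = AB_k + a_{n-k}I$ in the proof can be turned into a small numerical check. The sketch below is not from the text: it uses the Faddeev-LeVerrier formula $a_{n-k} = -\operatorname{tr}(AB_k)/k$ to obtain the coefficients, which is a swapped-in technique that assumes a field of characteristic 0 (here the rationals), and then evaluates $p(A)$ by Horner's rule. All function names are hypothetical.

```python
from fractions import Fraction


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def add_scalar(M, c):
    """Return M + c*I."""
    n = len(M)
    return [[M[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]


def characteristic_polynomial(A):
    """Coefficients [a_0, ..., a_{n-1}, 1] via Faddeev-LeVerrier (characteristic 0 only)."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    coeffs = [Fraction(0)] * n + [Fraction(1)]
    B = add_scalar([[Fraction(0)] * n for _ in range(n)], Fraction(1))  # B_1 = I
    for k in range(1, n + 1):
        AB = matmul(A, B)
        coeffs[n - k] = -sum(AB[i][i] for i in range(n)) / k
        B = add_scalar(AB, coeffs[n - k])  # B_{k+1} = A B_k + a_{n-k} I
    return coeffs


def evaluate(coeffs, A):
    """p(A) by Horner's rule, with coeffs listed from the constant term upward."""
    n = len(A)
    result = [[Fraction(0)] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = add_scalar(matmul(result, A), c)
    return result


A = [[2, -2, 4], [2, 3, 2], [-1, 1, -1]]
p = characteristic_polynomial(A)
print([str(c) for c in p])
print(evaluate(p, A))
```

The matrix is the one from the inverse example above, so the printed coefficients should match $X^3 - 4X^2 + 7X - 10$ and the second line should be the zero matrix.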


We see from Proposition 12.14 and Proposition 12.16 that the minimal polyno- 
mial of any n x n matrix over a field divides its characteristic polynomial and so the 
degree of the minimal polynomial is at most n. 

Let V be a vector space finitely generated over a field F and let
$\sigma_0 \neq \alpha \in \operatorname{End}(V)$. In Proposition 12.4, we saw that $\alpha$ is diagonalizable
if and only if there is a basis of V composed of eigenvectors of $\alpha$. Moreover, in
that case, if $\operatorname{spec}(\alpha) = \{c_1, \dots, c_k\}$ and if, for each $1 \le i \le k$, we denote
the eigenspace of $\alpha$ associated with $c_i$ by $W_i$, then for each $1 \le i \le k$ we have a
projection $\pi_i \in \operatorname{End}(V)$ satisfying the following conditions:

(1) $\operatorname{im}(\pi_i) = W_i$;
(2) $\pi_1 + \dots + \pi_k = \sigma_1$;
(3) $\pi_i \pi_j = \sigma_0$ whenever $i \neq j$;
(4) $\alpha = c_1\pi_1 + \dots + c_k\pi_k$.

For each $1 \le h \le k$, let $p_h(X)$ be the hth Lagrange interpolation polynomial
determined by $c_1, \dots, c_k$. Then we can check that $\pi_h = p_h(\alpha)$ for each h, since
$p_h(X)(X - c_h)$ is just a scalar multiple of the minimal polynomial of $\alpha$.
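The identity $\pi_h = p_h(\alpha)$ gives a direct recipe for the projections of a diagonalizable matrix. The following sketch, which is only an illustration, evaluates the Lagrange polynomials $p_h(X) = \prod_{j \neq h} (X - c_j)/(c_h - c_j)$ at a matrix and checks conditions (2) to (4). The test matrix is the diagonalizable matrix with eigenvalues 1 and $-3$ from the earlier example; the function name `spectral_projections` is hypothetical.

```python
from fractions import Fraction


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def spectral_projections(A, eigenvalues):
    """Return [p_h(A)] for the Lagrange polynomials p_h of the distinct eigenvalues."""
    n = len(A)
    projections = []
    for h, ch in enumerate(eigenvalues):
        P = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
        for j, cj in enumerate(eigenvalues):
            if j != h:
                # multiply by (A - c_j I) / (c_h - c_j)
                factor = [[(A[r][s] - (cj if r == s else 0)) / Fraction(ch - cj)
                           for s in range(n)] for r in range(n)]
                P = matmul(P, factor)
        projections.append(P)
    return projections


B = [[1, -4, -4], [8, -11, -8], [-8, 8, 5]]
eigenvalues = [1, -3]
pi_1, pi_2 = spectral_projections(B, eigenvalues)

total = [[pi_1[i][j] + pi_2[i][j] for j in range(3)] for i in range(3)]
combo = [[1 * pi_1[i][j] + (-3) * pi_2[i][j] for j in range(3)] for i in range(3)]
print(total)                # condition (2): the identity matrix
print(matmul(pi_1, pi_2))   # condition (3): the zero matrix
print(combo == B)           # condition (4)
```

If the matrix is not diagonalizable, the matrices returned by this function need not be projections, so the sketch should only be used under the hypotheses stated in the text.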

Is it possible to simultaneously diagonalize two distinct endomorphisms of V?
Indeed, let V be a vector space finitely generated over a field F and let $\alpha$ and
$\beta$ be distinct elements of $\operatorname{End}(V) \setminus \{\sigma_0\}$. There exists a basis D of V such that
both $\Phi_{DD}(\alpha)$ and $\Phi_{DD}(\beta)$ are diagonal matrices if and only if the elements of D
are eigenvectors of $\alpha$ as well as of $\beta$. Suppose that we have in hand such a basis
$D = \{u_1, \dots, u_n\}$. Since diagonal matrices commute with each other, we see that
$\Phi_{DD}(\alpha\beta) = \Phi_{DD}(\alpha)\Phi_{DD}(\beta) = \Phi_{DD}(\beta)\Phi_{DD}(\alpha) = \Phi_{DD}(\beta\alpha)$
and so $\alpha\beta = \beta\alpha$.
Therefore, a necessary condition for both endomorphisms of V to be represented by
diagonal matrices with respect to the same basis is that they form a commuting pair.

We also note that if D is a basis for a vector space V over a field F then the set
of all endomorphisms $\alpha$ of V satisfying the condition that $\Phi_{DD}(\alpha)$ is a diagonal
matrix is a subspace of $\operatorname{End}(V)$. Indeed, this is an immediate consequence of the
fact that the set of all diagonal $n \times n$ matrices is a subspace of $M_{n \times n}(F)$.


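Proposition 12.17 Let V be a vector space over a field F and let $\alpha, \beta$ be a
commuting pair of endomorphisms of V. Then $p(\alpha)q(\beta) = q(\beta)p(\alpha)$ for any
$p(X), q(X) \in F[X]$.


Proof Initially, we will consider the special case of $q(X) = X$. If
$p(X) = \sum_{i=0}^{n} a_i X^i$ then
$\beta\alpha^2 = (\beta\alpha)\alpha = (\alpha\beta)\alpha = \alpha(\beta\alpha) = \alpha(\alpha\beta) = \alpha^2\beta$ and, by
induction, we similarly have $\beta\alpha^k = \alpha^k\beta$ for every positive integer k. Therefore

$$\beta p(\alpha) = \beta\left(\sum_{i=0}^{n} a_i\alpha^i\right) = \sum_{i=0}^{n} a_i\beta\alpha^i
= \sum_{i=0}^{n} a_i\alpha^i\beta = \left(\sum_{i=0}^{n} a_i\alpha^i\right)\beta = p(\alpha)\beta.$$

Now a proof similar to the first part shows that $p(\alpha)\beta^k = \beta^k p(\alpha)$ for every positive
integer k and hence, by a proof similar to the second part, we get
$p(\alpha)q(\beta) = q(\beta)p(\alpha)$ for any $p(X), q(X) \in F[X]$.


As a consequence of this we note that if $\alpha, \beta \in \operatorname{End}(V)$ are commuting
projections then
$(\alpha\beta)^2 = (\alpha\beta)(\alpha\beta) = \alpha(\beta\alpha)\beta = \alpha(\alpha\beta)\beta = \alpha^2\beta^2 = \alpha\beta$
and so $\alpha\beta$ is a projection as well.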


Proposition 12.18 Let V be a vector space finitely generated over a field F
and let $\alpha, \beta \in \operatorname{End}(V)$ be diagonalizable endomorphisms of V. Then there
exists a basis of V relative to which both $\alpha$ and $\beta$ can be represented by
diagonal matrices if and only if $\alpha\beta = \beta\alpha$.

Proof We have already noted that if $\alpha$ and $\beta$ can both be represented by diagonal
matrices with respect to a given basis of V then we must have $\alpha\beta = \beta\alpha$.
Conversely, assume that $\alpha$ and $\beta$ are diagonalizable endomorphisms of V satisfying
$\alpha\beta = \beta\alpha$. Then, as we have already seen, there exist distinct scalars $c_1, \dots, c_k$
and projections $\pi_1, \dots, \pi_k \in \operatorname{End}(V)$ such that $\pi_1 + \dots + \pi_k = \sigma_1$,
$\pi_i\pi_j = \sigma_0$ for $i \neq j$, and $c_1\pi_1 + \dots + c_k\pi_k = \alpha$. Similarly, there exist
scalars $d_1, \dots, d_t$ and projections $\eta_1, \dots, \eta_t \in \operatorname{End}(V)$ such that
$\eta_1 + \dots + \eta_t = \sigma_1$, $\eta_i\eta_j = \sigma_0$ for $i \neq j$, and
$d_1\eta_1 + \dots + d_t\eta_t = \beta$. Therefore,
$$\alpha = \alpha\sigma_1 = \left(\sum_{i=1}^{k} c_i\pi_i\right)\left(\sum_{j=1}^{t} \eta_j\right)
= \sum_{i=1}^{k}\sum_{j=1}^{t} c_i\pi_i\eta_j
\quad\text{and}\quad
\beta = \beta\sigma_1 = \left(\sum_{j=1}^{t} d_j\eta_j\right)\left(\sum_{i=1}^{k} \pi_i\right)
= \sum_{i=1}^{k}\sum_{j=1}^{t} d_j\eta_j\pi_i.$$
Since we saw that for each $1 \le i \le k$ we have $\pi_i = p_i(\alpha)$ for some $p_i(X) \in F[X]$
and similarly for each $1 \le j \le t$ we have $\eta_j = q_j(\beta)$ for some $q_j(X) \in F[X]$, we
conclude from Proposition 12.17 that $\pi_i\eta_j = \eta_j\pi_i$ for each such i and j. Call this
common value $\theta_{ij}$. By the comments after Proposition 12.17, we see that $\theta_{ij}$ is
also a projection in $\operatorname{End}(V)$. We note that
$\theta_{ij}\theta_{hm} = \pi_i\eta_j\pi_h\eta_m = \pi_i\pi_h\eta_j\eta_m$ and this equals $\sigma_0$ when
$i \neq h$ or $j \neq m$. Moreover,
$\sum_{i=1}^{k}\sum_{j=1}^{t} \theta_{ij} = \left(\sum_{i=1}^{k}\pi_i\right)\left(\sum_{j=1}^{t}\eta_j\right) = \sigma_1$.
Hence we have shown that $\alpha$ and $\beta$ are simultaneously diagonalizable, using those
projections $\theta_{ij}$ which are nonzero (as some of them may be zero).
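The construction in this proof can be followed on a small case. The sketch below is illustrative only: the two commuting symmetric matrices are hypothetical test data chosen for this illustration, the projections are obtained from Lagrange interpolation polynomials as described earlier, and the products $\theta_{ij} = \pi_i\eta_j$ are formed exactly as in the proof.

```python
from fractions import Fraction


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def spectral_projections(A, eigenvalues):
    """Evaluate the Lagrange polynomials of the distinct eigenvalues at A."""
    n = len(A)
    out = []
    for h, ch in enumerate(eigenvalues):
        P = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
        for j, cj in enumerate(eigenvalues):
            if j != h:
                factor = [[(A[r][s] - (cj if r == s else 0)) / Fraction(ch - cj)
                           for s in range(n)] for r in range(n)]
                P = matmul(P, factor)
        out.append(P)
    return out


A = [[2, 1], [1, 2]]   # hypothetical example; eigenvalues 1 and 3
B = [[0, 1], [1, 0]]   # hypothetical example; eigenvalues -1 and 1
assert matmul(A, B) == matmul(B, A)  # the pair commutes

zero = [[0, 0], [0, 0]]
thetas = {}
for c, pi in zip([1, 3], spectral_projections(A, [1, 3])):
    for d, eta in zip([-1, 1], spectral_projections(B, [-1, 1])):
        theta = matmul(pi, eta)
        if theta != zero:
            thetas[(c, d)] = theta  # image consists of common eigenvectors

for (c, d), theta in thetas.items():
    print((c, d), [[str(x) for x in row] for row in theta])
```

Each surviving $\theta_{ij}$ projects onto a subspace on which the first matrix acts as $c_i$ and the second as $d_j$; choosing bases of these images gives a common basis of eigenvectors.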


We now turn to another classical result. 


Proposition 12.19 Let V be a vector space over a field F and let K be a
subalgebra of $\operatorname{End}(V)$ such that there is no nontrivial proper subspace of
V which is invariant under every $\alpha \in K$. Suppose that $\beta \in \operatorname{End}(V)$ has a
nonempty spectrum and commutes with every element of K. Then $\beta = \sigma_c$ for
some $c \in F$.


Proof Pick $c \in \operatorname{spec}(\beta)$ and let W be the eigenspace of $\beta$ associated with c. This is
a nontrivial subspace of V. If $\alpha \in K$ and $w \in W$ then
$\beta\alpha(w) = \alpha\beta(w) = \alpha(cw) = c\alpha(w)$ and so $\alpha(w) \in W$. Thus W is a nontrivial
subspace of V invariant under every $\alpha \in K$ and so, by assumption, it cannot be
proper. Therefore, $W = V$ and so $\beta = \sigma_c$.


Recall that if the field F is algebraically closed then any element of $\operatorname{End}(V)$ other
than $\sigma_0$ has a nonempty spectrum.


Proposition 12.20 Let V be a vector space over an algebraically-closed field
F and let K be a unital subalgebra of $\operatorname{End}(V)$ such that there is no nontrivial
proper subspace of V which is invariant under every $\alpha \in K$. Let $v \in V$ and let
W be a finitely-generated subspace of V satisfying the condition that if $\alpha \in K$
and $W \subseteq \ker(\alpha)$ then $v \in \ker(\alpha)$. Then $v \in W$.

Proof We will prove the result by induction on $n = \dim(W)$. If $n = 0$ then W
is trivial. Since K is unital, $\sigma_1 \in K$ and $W \subseteq \ker(\sigma_1)$. Therefore, by hypothesis,
$v \in \ker(\sigma_1)$ and so $v = 0_V \in W$.

Now assume, inductively, that $n \ge 1$ and that the result has been established for
all subspaces of V of dimension less than n. Pick $0_V \neq w_0 \in W$ and let $W_1$ be a
complement of $Fw_0$ in W. Set $L = \{\alpha \in K \mid W_1 \subseteq \ker(\alpha)\}$. This set is nonempty
since $\sigma_0 \in L$. Moreover, it is in fact a subspace of K as a vector space over F.
Moreover, if $\alpha \in L$ and $\beta \in K$ then $\beta\alpha \in L$, so in particular L is a subalgebra of K.
Moreover, $Y = \{\alpha(w_0) \mid \alpha \in L\}$ is a subspace of V.

Since $w_0 \notin W_1$, we know by the induction hypothesis that there exists an element
$\alpha_0$ of L satisfying $w_0 \notin \ker(\alpha_0)$ and so Y is nontrivial. However, $\beta(y) \in Y$ for each
$y \in Y$ and $\beta \in K$. Thus Y is invariant under every element of K and so, by hypothesis,
$Y = V$. Define the function $\theta\colon V \to V$ by $\theta\colon \alpha(w_0) \mapsto \alpha(v)$. This function is
well-defined for if $\alpha_1(w_0) = \alpha_2(w_0)$ then $w_0 \in \ker(\alpha_1 - \alpha_2)$ and so
$W \subseteq \ker(\alpha_1 - \alpha_2)$. Hence, by assumption, $v \in \ker(\alpha_1 - \alpha_2)$, i.e.,
$\alpha_1(v) = \alpha_2(v)$. It is straightforward to check that in fact $\theta \in \operatorname{End}(V)$.

If $\beta \in K$ then
$(\theta\beta)(\alpha(w_0)) = \theta(\beta\alpha(w_0)) = \beta\alpha(v) = \beta(\theta(\alpha(w_0))) = (\beta\theta)(\alpha(w_0))$
and so $\theta$ commutes with every element of K. By Proposition 12.19, this implies that
$\theta = \sigma_c$ for some $c \in F$. Thus, for any $\alpha \in L$ we have
$\alpha(v) = \theta\alpha(w_0) = c\alpha(w_0) = \alpha(cw_0)$ and so $\alpha(v - cw_0) = 0_V$. By the induction
hypothesis, this implies that $v - cw_0 \in W_1$ and so $v \in W$, as desired.


Proposition 12.21 (Burnside's Theorem) Let V be a vector space finitely
generated over an algebraically-closed field F and let K be a unital
subalgebra of $\operatorname{End}(V)$ the elements of which commute with all endomorphisms of
the form $\sigma_c$ for $c \in F$. Assume furthermore that there is no nontrivial proper
subspace of V which is invariant under every $\alpha \in K$. Then $K = \operatorname{End}(V)$.


Proof Pick a basis $\{v_1, \dots, v_n\}$ for V over F and, for all $1 \le i, j \le n$, let
$\theta_{ij}$ be the endomorphism of V defined by the condition that

$$\theta_{ij}\colon v_k \mapsto \begin{cases} v_j & \text{if } k = i, \\ 0_V & \text{otherwise.} \end{cases}$$

These endomorphisms form a basis for $\operatorname{End}(V)$ and so it suffices to show that
$\theta_{ij} \in K$ for all $1 \le i, j \le n$. Fix $i \in \{1, \dots, n\}$ and let

$$L_i = \{\alpha \in K \mid \alpha(v_h) = 0_V \text{ for all } h \neq i\}.$$

By Proposition 12.20, there is an element $\alpha_0$ of $L_i$ satisfying $\alpha_0(v_i) \neq 0_V$ and so, as
in the proof of that proposition, we see that $\{\alpha(v_i) \mid \alpha \in L_i\}$ equals V. In particular,
if $1 \le j \le n$ there exists an element $\beta_j$ of $L_i$ satisfying $\beta_j(v_i) = v_j$. Thus
$\theta_{ij} = \beta_j \in K$.
With kind permission of the Archives of the Mathematisches
Forschungsinstitut Oberwolfach (Tate).

The British mathematician William Burnside pub- 
lished important works on group theory at the end 
of the nineteenth century. Burnside’s original result 
has been extensively generalized. The above proof 
is based on the proof of one such generalization, 
by the twentieth-century American mathematician 
John Tate. 


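Proposition 12.21 holds for the case of $F = \mathbb{C}$. If the field F is not algebraically
closed, this theorem may not hold.


Example Let $F = \mathbb{R}$ and let $\alpha$ be the endomorphism of $\mathbb{R}^2$ defined by
$\alpha\colon \begin{bmatrix} a \\ b \end{bmatrix} \mapsto \begin{bmatrix} -b \\ a \end{bmatrix}$.
Then $\alpha^2 = -\sigma_1$ and so $K = \{a\alpha + b\sigma_1 \mid a, b \in \mathbb{R}\}$ is a proper subalgebra of
$\operatorname{End}(\mathbb{R}^2)$ for which there are no nontrivial proper subspaces of $\mathbb{R}^2$ invariant under
every element of K.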


Algorithms for the computation of the eigenvalues and eigenvectors of a given
matrix are usually very complicated, especially if speed of computation is a major
consideration. Therefore, we shall not go into the description of such algorithms
in detail. As a rule of thumb, it is best to try to compute eigenvectors directly, and
not through finding roots of the characteristic polynomial, since small errors in the
computation of eigenvalues may often lead to large errors in the computation of the
corresponding eigenvectors. For matrices over $\mathbb{R}$, there are often reasonably
efficient iterative methods to find at least some of the eigenvectors. We will bring here
one example to find an eigenvector associated with the real eigenvalue of a matrix
over $\mathbb{R}$ having greatest absolute value (often called the dominant eigenvalue),
under the assumption that such an eigenvalue indeed exists. The algorithm is based
on the observation that if c is an eigenvalue of a matrix $A \in M_{n \times n}(\mathbb{R})$ then $c^k$
is an eigenvalue of $A^k$. Hence, if k is sufficiently large, the matrix $A(A^k)$ is
approximately equal to $cA^k$. Therefore, if we select an arbitrary vector
$v^{(0)} \in \mathbb{R}^n$ and successively define vectors $v^{(1)}, v^{(2)}, \dots$ by setting
$v^{(i+1)} = Av^{(i)}$ for each $i \ge 0$, then $Av^{(k)} = A^{k+1}v^{(0)}$ and this is roughly equal
to $cv^{(k)}$. So, if the circumstances are amenable (and we will not go into the precise
conditions necessary for this to happen), the vector $v^{(k)}$ is a reasonable approximation
to an eigenvector of A associated with c. Of course, we must always remember that
repeated computations lead to accumulating roundoff and truncation errors; one way of
combating these is to divide each entry in $v^{(i)}$ by the absolute value of the largest
entry, and use this "normalized" vector in the next iteration.
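The iteration just described can be sketched as follows. This is a minimal illustration rather than a production algorithm: the function name `power_method`, the fixed iteration count, and the $2 \times 2$ test matrix (with dominant eigenvalue 3 and eigenvector proportional to $(1, 1)$) are all assumptions made for the example, and no convergence test is performed.

```python
def power_method(A, v, iterations=50):
    """Approximate a dominant eigenpair of A by repeated multiplication.

    After each multiplication the vector is divided by the absolute value
    of its largest entry, as suggested in the text.
    """
    n = len(A)
    for _ in range(iterations):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        scale = max(abs(x) for x in w)
        if scale == 0:
            raise ValueError("iteration reached the zero vector; pick another start")
        v = [x / scale for x in w]
    # estimate the eigenvalue from the entry of largest absolute value
    w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    k = max(range(n), key=lambda i: abs(v[i]))
    return w[k] / v[k], v


A = [[2.0, 1.0], [1.0, 2.0]]          # hypothetical test matrix, eigenvalues 1 and 3
eigenvalue, eigenvector = power_method(A, [1.0, 0.0])
print(eigenvalue, eigenvector)
```

The error shrinks roughly like the kth power of the ratio between the second-largest and largest absolute eigenvalues, so convergence is slow when these are close and fails when the starting vector has no component along the dominant eigenvector.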
With kind permission of NPL (Wilkinson); with kind permission
of the Archives of the Mathematisches Forschungsinstitut
Oberwolfach (von Mises).

Of the many numerical analysts who studied com- 
putational methods for finding eigenvalues, one of 
the most important is the British mathematician 
James H. Wilkinson, a former assistant of Alan 
Turing and one of the major early innovators in nu- 
merical linear algebra. The iteration algorithm given here was first studied in the 1920s by 
the Austrian applied mathematician Richard von Mises, who later emigrated to the United 
States. 


After one calculates the dominant eigenvalue of a matrix in $M_{n \times n}(\mathbb{R})$, there
are various techniques, known as deflation techniques, for creating a new matrix in
$M_{(n-1) \times (n-1)}(\mathbb{R})$ the eigenvalues of which are the same as all of the eigenvalues of
the original matrix, except for the dominant eigenvalue.


5 1 


3 4 € M2 x2(R) and let us pick v® = p then 


Example Consider A = | 


1 
Av® = fa] , and so we will take vD = | | ; 


av = 51 iol and so we will take =| 3]; 
Av = =I 36: and so we will take =| a|: 
av =| Se: and so we will take =| a|: 
Av Aa and so we will take =| al; 


It seems that this sequence of vectors is converging to
$\begin{bmatrix} 1 \\ -1 \end{bmatrix}$ and, indeed, one
can check that this is an eigenvector of A associated with the eigenvalue 4.

Again, preconditioning can be used to make iterative methods for finding
eigenvalues converge more rapidly.


Example Let n be a positive integer and let $A \in M_{n \times n}(\mathbb{R})$ be a matrix of the form
$[cB + (1 - c)D]^T$, where $B \in M_{n \times n}(\mathbb{R})$ is a Markov matrix, $c \in \mathbb{R}$ satisfies
$0 < c < 1$, and
$D = \begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix}\begin{bmatrix} d_1 \\ \vdots \\ d_n \end{bmatrix}^T$
for nonnegative real numbers $d_i$ satisfying $\sum_{i=1}^{n} d_i = 1$.
Such matrices have been called Google matrices since they are needed for the
PageRank algorithm used by the internet search engine Google™ to compute an estimate
of webpage importance for ranking search results (for these purposes, a typical value
for c is 0.85). The value of n can be very large, often far larger than $10^9$.

One can show that the eigenvalues $e_1, \dots, e_n$ of such a matrix satisfy
$1 = |e_1| \ge |e_2| \ge \dots \ge |e_n| \ge 0$, and so the power method mentioned above can be
(and is) used by Google to rapidly compute an eigenvector associated to $e_1$. Stanford
University researchers Taher Haveliwala and Sepandar Kamvar have shown that for any
Google matrix, $|e_2| \le c$, with equality happening under conditions that hold in the case
of those matrices arising in this particular application. Eigenvectors corresponding to
this second eigenvalue can be used to detect and combat link spamming on the
internet.
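To make the construction concrete, here is a small sketch of the power method applied to a Google matrix. It is a toy illustration and not a description of any production system: the three-page Markov matrix B (assumed here to have rows summing to 1), the uniform vector d, and the function names are all hypothetical choices.

```python
def google_matrix(B, d, c=0.85):
    """Build [cB + (1 - c)D]^T, where D has every row equal to d."""
    n = len(B)
    # entry (i, j) of the transpose is c*B[j][i] + (1 - c)*d[i]
    return [[c * B[j][i] + (1 - c) * d[i] for j in range(n)] for i in range(n)]


def pagerank(G, iterations=200):
    """Power iteration for an eigenvector of G associated with eigenvalue 1."""
    n = len(G)
    r = [1.0 / n] * n
    for _ in range(iterations):
        r = [sum(G[i][j] * r[j] for j in range(n)) for i in range(n)]
        total = sum(r)
        r = [x / total for x in r]  # keep the entries summing to 1
    return r


B = [[0.0, 0.5, 0.5],   # hypothetical link structure: page 0 links to pages 1 and 2
     [1.0, 0.0, 0.0],   # page 1 links only to page 0
     [0.5, 0.5, 0.0]]   # page 2 links to pages 0 and 1
d = [1 / 3, 1 / 3, 1 / 3]
G = google_matrix(B, d)
ranks = pagerank(G)
print(ranks)
```

Since $|e_2| \le c$, the error after k iterations is bounded by a multiple of $c^k$, which is why a moderate number of iterations suffices even for very large n.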

One can also consider various generalizations of the eigenvalue problem. Thus,
for example, given endomorphisms $\alpha$ and $\beta$ of a vector space V, one can seek to find
all scalars c such that $c\beta - \alpha$ is not monic. Problems of this sort arise naturally, for
example, in plasma physics and in the design of control systems. Very often, such
problems can be formulated as a matter of minimizing the largest generalized
eigenvalue of a pair of symmetric matrices. When $\beta$ is an automorphism, as is usually the
case, such generalized eigenvalue problems can be reduced to the usual eigenvalue
problem for the endomorphism $\beta^{-1}\alpha$, but there are often reasons for not wanting
to do so. For example, even if both $\alpha$ and $\beta$ are represented with respect to a given
basis by symmetric matrices, the matrix representing $\beta^{-1}\alpha$ may not be symmetric.
Therefore, some specialized algorithms have been developed to find solutions of the
generalized eigenvalue problem directly.

If V has finite dimension n and the endomorphisms $\alpha$ and $\beta$ are represented
with respect to some basis by matrices A and B, respectively, one can look at the
generalized characteristic polynomial $|XB - A|$. Problems arise, however, since the
degree of this polynomial may be less than n, if the matrix B is singular. In fact, this
polynomial may even be the 0-polynomial.

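A further generalization of the eigenvalue problem is the following: Given
endomorphisms $\alpha_0, \dots, \alpha_n$ of V, find all scalars c such that the endomorphism
$\sum_{i=0}^{n} c^i\alpha_i$ is not monic. Various techniques have been developed to handle this
problem directly in special cases. Also, it can sometimes be reduced to the case of
$n = 1$. For example, finding a vector $0_V \neq v \in V$ in the kernel of
$c^2\alpha_2 + c\alpha_1 + \alpha_0$ is equivalent to finding a nonzero element of the form
$\begin{bmatrix} cv \\ v \end{bmatrix}$ in the kernel of $c\beta_1 - \beta_0$,
where the $\beta_i$ are the endomorphisms of $V^2$ defined by

$$\beta_0\colon \begin{bmatrix} x \\ y \end{bmatrix} \mapsto \begin{bmatrix} \alpha_1(x) + \alpha_0(y) \\ x \end{bmatrix}
\quad\text{and}\quad
\beta_1\colon \begin{bmatrix} x \\ y \end{bmatrix} \mapsto \begin{bmatrix} -\alpha_2(x) \\ y \end{bmatrix}.$$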
Exercises


Exercise 738

Let n be a positive integer and let $A \in M_{n \times n}(\mathbb{Q})$. Let $c_1, \dots, c_n$ be the list of
(not necessarily distinct) eigenvalues of A, considered as a matrix in $M_{n \times n}(\mathbb{C})$.
Show that $\sum_{i=1}^{n} c_i$ and $\prod_{i=1}^{n} c_i$ are rational numbers.


Exercise 739

Let F be a field, let n be a positive integer, and let $A, B \in M_{n \times n}(F)$. Assume
that A and B have the same characteristic polynomial $p(X) \in F[X]$. Is it
necessarily true that p(X) is the characteristic polynomial of AB?


Exercise 740

Find infinitely-many matrices in $M_{3 \times 3}(\mathbb{R})$, all of which have characteristic
polynomial $X(X - 1)(X - 2)$.


Exercise 741 


3 2 
Find the characteristic polynomial of | 1 4 1 | €M3x3(R). 
2 1 


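Exercise 742

Let $a, b, c \in \mathbb{R}$. Find the characteristic polynomial of the matrix

$$\begin{bmatrix} 0 & 0 & 0 & a \\ a & 0 & 0 & b \\ 0 & b & 0 & c \\ 0 & 0 & c & 0 \end{bmatrix} \in M_{4 \times 4}(\mathbb{R}).$$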


Exercise 743

Let n be a positive integer. Show that every matrix $A \in M_{n \times n}(\mathbb{R})$ can be written
as the sum of two nonsingular matrices.


Exercise 744

Let $F = GF(3)$ and let n be a positive integer. Let $D = [d_{ij}] \in M_{n \times n}(F)$ be a
nonsingular diagonal matrix and let $A \in M_{n \times n}(F)$. Show that $1 \notin \operatorname{spec}(DA)$ if
and only if $D - A$ is nonsingular.


Exercise 745

Let F be a field of characteristic other than 2. For each positive integer n, let
$T_n$ be the set of all diagonal matrices in $M_{n \times n}(\mathbb{R})$ the diagonal entries of which
belong to $\{-1, 1\}$. For any $A \in M_{n \times n}(\mathbb{R})$, show that there exists a matrix $D \in T_n$
satisfying $1 \notin \operatorname{spec}(DA)$.
Exercise 746

Let n be a positive integer and let $\alpha\colon M_{n \times n}(\mathbb{C}) \to \mathbb{C}^n$ be the function defined
by $\alpha\colon A \mapsto \begin{bmatrix} a_0 \\ \vdots \\ a_{n-1} \end{bmatrix}$, where
$X^n + \sum_{i=0}^{n-1} a_i X^i$ is the characteristic polynomial
of A. Is $\alpha$ a linear transformation?


Exercise 747

Define $\alpha \in \operatorname{End}(\mathbb{R}^3)$ by
$\alpha\colon \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} a - b \\ a + 2b + c \\ -2a + b - c \end{bmatrix}$.
Find the eigenvalues of $\alpha$ and, for each eigenvalue, find the associated eigenspace.


Exercise 748

Let A be a nonempty set and let V be the collection of all subsets of A, which is
a vector space over GF(2). Let B be a fixed subset of A and let $\alpha\colon V \to V$ be
the endomorphism defined by $\alpha\colon Y \mapsto Y \cap B$. Find the eigenvalues of $\alpha$ and, for
each eigenvalue, find the associated eigenspace.


Exercise 749

Let V be a vector space over a field F and let $\alpha$ be an endomorphism of V. Show
that the one-dimensional subspaces of V invariant under $\alpha$ are precisely those of
the form Fv, where v is an eigenvector of $\alpha$.


Exercise 750 


a b+c 
Define a € End(R*) by a: : a o . Find the eigenvalues of œ. Do 
d 0 


there exist two-dimensional subspaces W and Y of $\mathbb{R}^4$, both invariant under $\alpha$,
such that $\mathbb{R}^4 = W \oplus Y$?


Exercise 751 

Let V be a vector space finitely generated over a field F and let a be an endo- 
morphism of V having an eigenvalue c. For any p(X) € F[X], show that p(c) is 
an eigenvalue of p(a). 


Exercise 752

Let V be the vector space of all functions in $\mathbb{R}^{\mathbb{R}}$ which are infinitely differentiable
and let $\alpha\colon V \to V$ be the endomorphism of V defined by $\alpha\colon f \mapsto f''$. If $n > 0$
is an integer, show that the function $f\colon x \mapsto \sin(nx)$ is an eigenvector of $\alpha$ and
find the associated eigenvalue.


Exercise 753

Let F be a field and let $V = F^{\mathbb{Z}}$. Let $\alpha$ be the endomorphism of V defined by
$\alpha(f)\colon i \mapsto f(i + 1)$ for all $i \in \mathbb{Z}$. Determine whether $\operatorname{spec}(\alpha)$ is nonempty or
not.


Exercise 754

Let V be the vector space composed of all polynomial functions from $\mathbb{R}$ to itself,
let $a \in \mathbb{R}$, and let $\alpha$ be the endomorphism of V defined by
$\alpha(p)\colon x \mapsto (x - a)[p'(x) + p'(a)] - 2[p(x) - p(a)]$, where $p'$ denotes the derivative of p. Find
the eigenvalues of $\alpha$ and for each such eigenvalue, find the associated eigenspace.


Exercise 755

Let $\alpha$ be the endomorphism of $M_{2 \times 2}(\mathbb{R})$ defined by

$$\alpha\colon \begin{bmatrix} a & b \\ c & d \end{bmatrix} \mapsto \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}.$$

Find the eigenvalues of $\alpha$ and for each such eigenvalue, find the associated
eigenspace.


Exercise 756

Let V be a vector space over $\mathbb{Q}$ and let $\alpha \in \operatorname{End}(V)$ be a projection. Show that
$\operatorname{spec}(\alpha) \subseteq \{0, 1\}$. Is the converse true?


Exercise 757

Let V be a vector space of dimension $n > 0$ over a field F. Let $\alpha$ be an
endomorphism of V for which there exists a set A of $n + 1$ distinct eigenvectors satisfying
the condition that every subset of A of size n is a basis for V. Show that all of
the eigenvectors in V are associated with the same eigenvalue c of $\alpha$ and that
$\alpha = c\sigma_1$.


Exercise 758 


For a,b € R, let A = € M3x3(R). Find the eigenvalues of A. 


Ora 
“a cœ 
a Sco 


Exercise 759 
Let A € M2x2(R) be a matrix of the form f A where a > 0 and bc > 0. 


Show that A has two distinct eigenvalues in R. 


Exercise 760 


Find the eigenvalues of the matrix
$$\begin{bmatrix} 5 & 6 & -3 \\ -1 & 0 & 1 \\ 2 & 2 & -1 \end{bmatrix} \in M_{3\times 3}(\mathbb{R})$$
and, for each such eigenvalue, find the associated eigenspace.


Exercise 761


Find the eigenvalues of the matrix
$$\begin{bmatrix} 0 & 2 & 1 \\ -2 & 0 & 3 \\ -1 & -3 & 0 \end{bmatrix} \in M_{3\times 3}(\mathbb{C})$$
and, for each such eigenvalue, find the associated eigenspace.


Exercise 762


Let $W$ be the subspace of $\mathbb{R}^\infty$ consisting of all convergent sequences and let $\alpha$
be the endomorphism of $W$ defined by
$$\alpha : [a_1, a_2, \ldots] \mapsto \left[\left(\lim_{i\to\infty} a_i\right) - a_1,\ \left(\lim_{i\to\infty} a_i\right) - a_2,\ \ldots\right].$$
Find all eigenvalues of $\alpha$ and, for each eigenvalue, find the corresponding
eigenspace.


Exercise 763 


1 -1 1 
Find the eigenvalues of the matrix | 1 0 0 | in M3x3(C) and, for each 
0 1 0 


such eigenvalue, find the associated eigenspace. 


Exercise 764 
Does there exist a real number $a$ such that
$$\operatorname{spec}\begin{bmatrix} 1 & -1 & 0 \\ 0 & a & -1 \\ -6 & 11 & -5 \end{bmatrix} = \{-2, -1, 0\}?$$


Exercise 765 

Let $\alpha$ be an endomorphism of a vector space $V$ over a field $F$ and let $v$ and $w$ be
eigenvectors of $\alpha$. If $v + w \neq 0_V$, show that $v + w$ is an eigenvector of $\alpha$ if and
only if both $v$ and $w$ correspond to the same eigenvalue.


Exercise 766 
a 


1 0 

Show that the matrix A=|a a a | has three distinct eigenvalues for any 
a 0 -l 

real number a. 


Exercise 767 

Let n be a positive integer and let t be a nonzero real number. Let A € Mn xn (R) 
be the matrix all of the entries of which equal t. Find the eigenvalues of A and, 
for each such eigenvalue, find the associated eigenspace. 


Exercise 768 
Let $n$ be a positive integer and let $F$ be a field. Let $A$ be a nonsingular matrix in
$M_{n\times n}(F)$. Given the eigenvalues of $A$, find the eigenvalues of $A^{-1}$.


Exercise 769
Let $n$ be a positive integer and let $F$ be a field. Let $A = [a_{ij}] \in M_{n\times n}(F)$, and
let $c \in \operatorname{spec}(A)$. If $b, d \in F$, show that $bc + d \in \operatorname{spec}(bA + dI)$.


Exercise 770 
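Let $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \in M_{2\times 2}(\mathbb{R})$. If $t \in \mathbb{R}$ is a root of the polynomial
$bX^2 + (a - d)X - c \in \mathbb{R}[X]$, show that $\begin{bmatrix} 1 \\ t \end{bmatrix}$ is an eigenvector of $A$ associated with
the eigenvalue $a + bt$.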

Exercise 771 

Let $A \in M_{2\times 2}(\mathbb{C})$ be a matrix having two distinct eigenvalues. Show that there
are precisely four distinct matrices $B \in M_{2\times 2}(\mathbb{C})$ satisfying $B^2 = A$.


Exercise 772 


a 0 0 
Find all a € R such that | 2a 2a 2a | has a unique eigenvalue. 
0 0 a 


Exercise 773 
Find a real number $a$ such that the only eigenvalue of the matrix
$$\begin{bmatrix} a & 1 & 0 \\ -1 & 0 & -1 \\ 0 & 1 & -a \end{bmatrix} \in M_{3\times 3}(\mathbb{R})$$
is $0$.


Exercise 774
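For each $1 \le i \le 3$ and $2 \le j \le 3$, find a real number $a_{ij}$ such that
$\begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}$, $\begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}$, and $\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$ are all eigenvectors of the matrix
$$\begin{bmatrix} 1 & a_{12} & a_{13} \\ 1 & a_{22} & a_{23} \\ 1 & a_{32} & a_{33} \end{bmatrix} \in M_{3\times 3}(\mathbb{R}).$$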


Exercise 775 

Let 0#r € C and let n and m be positive integers. Let A = [ajj] E€ Mnxn(C) 
be given and let B = [bij] E€ Mnxn(C) be the matrix defined by bij = play 
for all 1 <i, j < n. Show that if d € C is an eigenvalue of A then r”d is an 
eigenvalue of B. 


Exercise 776 
Let $n$ be a positive integer and let $F$ be a field. A matrix $A \in M_{n\times n}(F)$ is a
magic matrix if and only if there exists a scalar $c \in F$ such that the sum of the
entries in each row and each column is $c$. Characterize magic matrices in terms
of their eigenvalues.


Exercise 777 
Let $A \in M_{2\times 2}(\mathbb{C})$ be a matrix having distinct eigenvalues $a \neq b$. Show that, for
all $n > 0$,
$$A^n = \frac{a^n}{a - b}(A - bI) + \frac{b^n}{b - a}(A - aI).$$


Exercise 778 
Let $A \in M_{2\times 2}(\mathbb{C})$ be a matrix having a unique eigenvalue $c$. Show that
$A^n = c^{n-1}[nA - (n - 1)cI]$ for all $n > 0$.


Exercise 779 
Let n be a positive integer and let A € Mnxn(C). Show that every eigenvector 
of A is also an eigenvector of adj(A). 


Exercise 780 

Let $n$ be a positive integer. Let $G$ be the set of all matrices $A \in M_{n\times n}(\mathbb{C})$
satisfying the condition that $\mathbb{C}^n$ has a basis composed of eigenvectors of $A$. Is $G$
closed under taking sums? Is it closed under taking products?


Exercise 781 
Let p(X) € C[X] and let A € Mnxn(C) for some positive integer n. Calculate 
the determinant of the matrix p(A) using the eigenvalues of A. 


Exercise 782 


= 2 
Let —1 ża € R and let A= [' one : ; “| € M2x2(R). Calculate A” 


a-—a 
for alln > 1. 


Exercise 783 
Let $n$ be a positive integer. Given a matrix $A \in M_{n\times n}(\mathbb{Q})$, find infinitely many
distinct matrices having the same eigenvalues as $A$.


Exercise 784 


1 0 0 0 
: : 0 Oc O 

Let c € R. Find the spectral radius of A = 0 -- 0 0 E€ Max4(C). 
0 0 0 0 


Exercise 785 
Let $A \in M_{n\times n}(\mathbb{R})$ be a matrix all entries in which are positive and let $c$ be a
positive real number greater than the spectral radius of $A$. Show that $|cI - A| > 0$.


Exercise 786
Let $n$ be a positive integer. Show that $1$ is an eigenvalue of any Markov matrix in
$M_{n\times n}(\mathbb{R})$.


Exercise 787 
Let n be a positive integer. Let A = [aij] € Mnxn(R) satisfy the condition that 
SS |aij| < 1 for all 1 < j <n. Show that |c| < 1 for all c € spec(A). 


Exercise 788 

Let $n$ be a positive integer and let $F$ be a field. Let $A = [a_{ij}] \in M_{n\times n}(F)$ be a
matrix satisfying the condition that the sum of the entries in each row equals $1$.
Let $1 \neq c \in \operatorname{spec}(A)$ and let $\begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix}$ be an eigenvector of $A$ associated with $c$.
Show that $\sum_{j=1}^{n} b_j = 0$.


Exercise 789 
Give an example of a matrix A € Mp2, 2(R) satisfying the condition that 
spec(A) = Ø but spec(A*) ASD. 


Exercise 790 

Find an example of matrices $A, B \in M_{2\times 2}(\mathbb{R})$ satisfying the condition that
every element of $\operatorname{spec}(A) \cup \operatorname{spec}(B)$ is positive but every element of $\operatorname{spec}(AB)$ is
negative.


Exercise 791 
Find a polynomial p(X) € C[X] of degree 2 satisfying the condition that all 


l-a 
matrices in M C) of the form 
2x2( ) | pla) A 


, for a € C, have the same char- 


acteristic polynomial. 


Exercise 792 

Let $F$ be a field and let $n$ be an even positive integer. Let $A, B \in M_{n\times n}(F)$ be
matrices satisfying $A = B^2$. Let $p(X)$ be the characteristic polynomial of $A$ and
let $q(X)$ be the characteristic polynomial of $B$. Show that $p(X^2) = q(X)q(-X)$.


Exercise 793 
Let $F$ be a field. Characterize the matrices in $M_{2\times 2}(F)$ having the property that
their characteristic polynomial is not equal to their minimal polynomial.


Exercise 794 
Let $(K, \bullet)$ be an associative unital $F$-algebra, let $v \in K$, and let $\alpha : K \to K$ be a
homomorphism of $F$-algebras. Show that $\operatorname{Ann}(v) \subseteq \operatorname{Ann}(\alpha(v))$.


Exercise 795


Are the matrices | : d and E 


3 | in M2x2(R) similar? 


Exercise 796 


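Are the matrices
$$\begin{bmatrix} 1 & i & 0 \\ i & 2 & -1 \\ 0 & i & 0 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 1+i & 7 & 2 \\ 0 & 1 & 9 \\ 0 & 0 & 2-i \end{bmatrix}$$
in $M_{3\times 3}(\mathbb{C})$ similar?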


Exercise 797 
Let $n$ be a positive integer and let $A, B \in M_{n\times n}(\mathbb{R})$. Show that if $A$ and $B$
are similar when considered as elements of $M_{n\times n}(\mathbb{C})$, they are also similar in
$M_{n\times n}(\mathbb{R})$.


Exercise 798 


1 0 1 
Find a diagonal matrix in M3x3(R) similar to the matrix | 0 1 

1 0 1 
Exercise 799 

0 0 1 
Find a diagonal matrix in M3x3(R) similar to the matrix | 0 0 0 

1 0 0 


Exercise 800 
Is there a diagonal matrix in M3x3(R) similar to the matrix 


8 2 .=3 
—6 -l 3 |? 
12 6 —4 


Exercise 801 
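Show that every matrix in the subspace of $M_{2\times 2}(\mathbb{R})$ generated by
$$\left\{ \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \right\}$$
is similar to a diagonal matrix.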


Exercise 802 


in M3 x3(GF(5)) are 


= o oo 
ooo 


0 1 0 0 
Determine if the matrices | 0 O 1 | and | 1 
0 0 0 0 


similar. 




Exercise 803 
Is there a diagonal matrix in M3x3(R) similar to the matrix 


i =1 2 
-2 Gd B19 
-2 ad 4 


Exercise 804 


Let $A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 1 & 1 \end{bmatrix} \in M_{3\times 3}(\mathbb{R})$. Find a nonsingular matrix $P \in M_{3\times 3}(\mathbb{R})$
such that $P^{-1}AP$ is a diagonal matrix.


Exercise 805 


Show that the matrix A = E i | € M2 x2(C) is not similar to a diagonal 


1 
matrix. 
Exercise 806 
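Are the matrices
$$\begin{bmatrix} 1 & -1 & 0 \\ 0 & 2 & 5 \\ 0 & 0 & 3 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 2 & 0 & 0 \\ -1 & 4 & 0 \\ 0 & 3 & 7 \end{bmatrix}$$
in $M_{3\times 3}(\mathbb{Q})$ similar?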


Exercise 807 
Let $k$ and $n$ be positive integers and let $F$ be a field. Let $A \in M_{k\times n}(F)$ and
$B \in M_{n\times k}(F)$. Is the matrix $\begin{bmatrix} AB & O \\ B & O \end{bmatrix}$ similar to the matrix $\begin{bmatrix} O & O \\ B & BA \end{bmatrix}$?


Exercise 808 


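Let $F$ be a field and let $A = \begin{bmatrix} a & 1 & 0 \\ 0 & a & 1 \\ 0 & 0 & a \end{bmatrix} \in M_{3\times 3}(F)$. For any $p(X) \in F[X]$,
show that
$$p(A) = \begin{bmatrix} p(a) & p'(a) & \tfrac{1}{2}p''(a) \\ 0 & p(a) & p'(a) \\ 0 & 0 & p(a) \end{bmatrix},$$
where $p'(X)$ denotes the formal derivative of the polynomial $p(X)$ and $p''(X)$ is the
formal derivative of $p'(X)$.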


Exercise 809 
Find the characteristic and minimal polynomials of the matrix 


7 4 4 
4 -8 —1 | € M3x3(R). 
af of -8 


Exercise 810

Let $n$ be a positive integer. Let $V$ be the vector space over $\mathbb{R}$ consisting of all
polynomial functions from $\mathbb{R}$ to itself having degree at most $n$. Let $\alpha$ be the
endomorphism of $V$ which assigns to each $f \in V$ its derivative, and let $A$ be a
matrix representing $\alpha$ with respect to some basis of $V$. Find the minimal
polynomial of $A$.


Exercise 811 


Find six distinct matrices in M2x2(R) which annihilate the polynomial X Zaj, 


Exercise 812 
Let $n$ be a positive integer and let $c$ be an element of a field $F$. Find a matrix
$A \in M_{n\times n}(F)$ having minimal polynomial $(X - c)^n$.


Exercise 813 


Use the Cayley—Hamilton Theorem to find the inverse of 


5 1 -l 
—6 0 2 | € M3x3(R). 
0 0 2 


Exercise 814 
Let $n$ be a positive integer and let $A \in M_{n\times n}(F)$ be a matrix of rank $h$. Show
that the degree of the minimal polynomial of $A$ is at most $h + 1$.


Exercise 815 
Let $n$ be a positive integer and let $F$ be a field. Show that a matrix $A \in M_{n\times n}(F)$
is nonsingular if and only if $m_A(0) \neq 0$.


Exercise 816 


Find the eigenvalues of the matrix
$$\begin{bmatrix} 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \end{bmatrix} \in M_{4\times 4}(\mathbb{Q})$$
and determine the algebraic multiplicity of each.


Exercise 817
O 1 0 
Find the minimal polynomial of the matrix | 0 0 1 | inM3x3(R). 


I! 3: =3 


Exercise 818
Let a and £ be the endomorphisms of Q* represented with respect to the canon- 


1 0 -1 0 4 -1 -l 0 

. : ; 0 1 0 -1 -1 4 0 -Ii 

ical basis by the matrices 10 1 0 and I 0 2 q [te 
0 1 0 -l 0 1 -l 2 


spectively. Does there exist a basis of Q* with respect to which both of them can 
be represented by diagonal matrices? 


Exercise 819 
Let $\alpha$ be the endomorphism of $\mathbb{R}^3$ represented with respect to the canonical
basis by the matrix
$$\begin{bmatrix} -6 & 2 & -5 \\ 4 & 4 & -2 \\ 10 & -3 & 8 \end{bmatrix}.$$
Calculate the algebraic and geometric multiplicities of each of the eigenvalues of $\alpha$.


Exercise 820 
Let $A = [a_{ij}] \in M_{n\times n}(\mathbb{C})$ be a symmetric tridiagonal matrix having an
eigenvalue $c$ with algebraic multiplicity $k$. Show that $a_{i-1,i} = 0$ for at least $k - 1$
values of $i$.


Exercise 821 
Let $\alpha$ be the endomorphism of $\mathbb{R}^3$ represented with respect to the canonical
basis by the matrix
$$\begin{bmatrix} -8 & -13 & -14 \\ -6 & -5 & -8 \\ 14 & 17 & 21 \end{bmatrix}.$$
Does there exist a basis of $\mathbb{R}^3$ with respect to which $\alpha$ can be represented by a
diagonal matrix?


Exercise 822 


Let A=] _ € Max4(Q). Find the minimal polyno- 


mial A. 


Exercise 823 

For each $t \in \mathbb{R}$, set
$$A(t) = \begin{bmatrix} \cos^2(t) & \cos(t)\sin(t) \\ \cos(t)\sin(t) & \sin^2(t) \end{bmatrix} \in M_{2\times 2}(\mathbb{R}).$$
Show that all of these matrices have the same characteristic and minimal polynomials.


Exercise 824 

Let $a, b, c \in \mathbb{C}$. Find a necessary and sufficient condition for the minimal
polynomial of $\begin{bmatrix} 2 & 0 & 0 \\ a & 2 & 0 \\ b & c & 1 \end{bmatrix} \in M_{3\times 3}(\mathbb{C})$ to be equal to $(X - 1)(X - 2)$.


Exercise 825


Let $A = \begin{bmatrix} 1 & a & 0 \\ a & a & 1 \\ a & a & -1 \end{bmatrix} \in M_{3\times 3}(\mathbb{R})$. Find the set of all real numbers $a$ for which
the minimal and characteristic polynomials of $A$ are equal.


Exercise 826 

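Let $F = GF(5)$. For which values of $a, b \in F$ are the characteristic polynomial
and minimal polynomial of the $5 \times 5$ matrix
$$\begin{bmatrix} a & b & 4 & 2 & 0 \\ b & b & b & 3 & 3 \\ 3 & 4 & 2b & 1 & 3 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 3b & 0 \end{bmatrix}$$
equal? What if $F = GF(7)$?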


Exercise 827 
Let $F$ be a field and let $O \neq A \in M_{3\times 3}(F)$ be a matrix satisfying $A^k = O$ for
some positive integer $k$. Show that $A^3 = O$.


Exercise 828 

Let $F$ be a field and let $A \in M_{3\times 3}(F)$ be a matrix which can be written in the
form $BC$, where $B$ and $C$ are involutory matrices in $M_{3\times 3}(F)$. Show that $A$ is
nonsingular and similar to $A^{-1}$.


Exercise 829 
Let $A \in M_{n\times n}(F)$ be written in the form $A = PB$, where $P, B \in M_{n\times n}(F)$ and
$P$ is nonsingular. Show that $A$ is similar to $BP$.


Exercise 830 
Let $A \in M_{3\times 3}(\mathbb{Q})$ be a matrix satisfying the condition that $A^5 = I$. Show that
$A = I$.


Exercise 831 

Let $n$ be a positive integer and let $\alpha$ be an endomorphism of $M_{n\times n}(\mathbb{C})$, considered
as a vector space over $\mathbb{C}$, which satisfies the condition that $\alpha(A)$ is
nonsingular if and only if $A$ is nonsingular. Show that $\alpha$ is an automorphism.


Exercise 832 

Let $F$ be a field and let $n$ be a positive integer. Let $A \in M_{n\times n}(F)$ be a matrix
having characteristic polynomial $p(X) = X^n + \sum_{i=0}^{n-1} c_i X^i$. Show that, for each
$k \ge n$, we have $A^k = \sum_{j=0}^{n-1} b_j(k) A^j$, where

(1) $b_j(n) = -c_j$ for all $0 \le j \le n - 1$;

(2) $b_{-1}(k) = 0$ for all $k \ge n$;

(3) $b_j(k + 1) = b_{j-1}(k) - c_j b_{n-1}(k)$ for all $k \ge n$ and all $0 \le j \le n - 1$.


Exercise 833
Show that there is no matrix A € M2 2(R) satisfying the condition that 
—1 0 
Dice 
Ar = | 0 where o 1. 


Exercise 834 

Let $V$ be a vector space finitely generated over $\mathbb{C}$ and let $\alpha \in \operatorname{End}(V)$ be
diagonalizable. If $W$ is a nontrivial subspace of $V$ invariant under $\alpha$, is the restriction
of $\alpha$ to $W$ necessarily diagonalizable?


Exercise 835 

Let $V$ be a vector space finitely generated over a field $F$ and let $\alpha \in \operatorname{End}(V)$.
Show that $\alpha$ is diagonalizable if and only if the sum of all of its eigenspaces
equals $V$.


Exercise 836 

Find all rational numbers $a$ satisfying the condition that the endomorphism of $\mathbb{Q}^3$
represented with respect to some basis by the matrix $\begin{bmatrix} 1 & 0 & 0 \\ 1 & a & 0 \\ 0 & 0 & 1 \end{bmatrix}$ is
diagonalizable.


Exercise 837 
Give an example of an endomorphism œ of R? having nullity 2 which is not 
diagonalizable. 


Exercise 838 

Let $n$ be a positive integer and let $B \in M_{n\times n}(\mathbb{R})$ be a matrix all entries of which
are positive. Let $r > \rho(B)$. Show that

(1) The matrix $A = rI - B$ is nonsingular;

(2) All nondiagonal entries of $A$ are nonpositive;

(3) All entries of $A^{-1}$ are nonnegative; and

(4) If $a + bi \in \mathbb{C}$ is an eigenvalue of $A$, then $a > 0$.


Exercise 839 

Let $V$ be a vector space of finite odd dimension over $\mathbb{R}$ and let $\alpha_1, \ldots, \alpha_k$ be
distinct mutually-commuting endomorphisms of $V$, for some $k > 1$. Show that
these endomorphisms have a common eigenvector.


Exercise 840 
Let A € M3x3(Q) have characteristic polynomial X? — bX? + cX — d. For all 
n > 3, show that 


A” = tn—1 A + tn—2 adj(A) + (tn — btn—1)I, 


where tn = Pee): 


Exercise 841


a 2b a c 
; |b 2a |b d 4 
Show that the endomorphisms a : el hea and £: el? aa of Q 
d 2c d b 


are diagonalizable and commute. Find a basis of Q¢ relative to which both a and 
B are represented by diagonal matrices. 


Exercise 842 
Let A = [ajj] € My xn(R) be a Markov matrix all entries of which are positive. 
If c € C is an eigenvalue of A satisfying |c| = 1, show that c = 1. 


Exercise 843 
Let F be a finite field. Show that there exists a symmetric matrix in M2x2(F) 
having no eigenvalues. 


Exercise 844 
Does there exist a square matrix A over R which is not idempotent but satisfies 
the condition that spec(A) = {1}? 


Exercise 845 

Let $n$ be a positive integer and let $A \in M_{n\times n}(\mathbb{Q})$ be a matrix all entries of which
are integers. Let $k$ be an integer which is an eigenvalue of $A$. Show that $|A|$ is an
integer and that $k$ divides $|A|$.


Exercise 846 

Let $V$ be a vector space of dimension 3 over a field $F$ and let $\alpha \in \operatorname{End}(V)$ have
nullity 2. Show that the characteristic polynomial of $\alpha$ is of the form $X^2(X - c)$
for some $c \in F$.


13 Krylov Subspaces


Let $V$ be a vector space over a field $F$ and let $\alpha \in \operatorname{End}(V)$. If $0_V \neq v_0 \in V$ then the
subspace $F\{v_0, \alpha(v_0), \alpha^2(v_0), \ldots\}$ of $V$ is called the Krylov subspace of $V$ defined
by $\alpha$ and $v_0$. The elements of this subspace are precisely those vectors in $V$ of the
form $p(\alpha)(v_0)$, where $p(X) \in F[X]$, and so it is natural to denote it by $F[\alpha]v_0$. It is
clear that $F[\alpha]v_0$ is invariant under $\alpha$.


Alexei Nikolaevich Krylov was a Russian applied mathematician who 
at the end of the nineteenth century developed many of the methods 
mentioned here in connection with the solution of differential equa- 
tions. 


Proposition 13.1 Let $V$ be a vector space over a field $F$, let $\alpha \in \operatorname{End}(V)$,
and let $0_V \neq v_0 \in V$.

(1) $F[\alpha]v_0$ is the intersection of all subspaces of $V$ containing $v_0$ and
invariant under $\alpha$;

(2) $v_0$ is an eigenvector of $\alpha$ if and only if $\dim(F[\alpha]v_0) = 1$.


Proof (1) Since $F[\alpha]v_0$ contains $v_0$ and is invariant under $\alpha$, it certainly contains the
intersection of all such subspaces of $V$. Conversely, if $W$ is a subspace of $V$ which
contains $v_0$ and is invariant under $\alpha$, then $p(\alpha)(v_0) \in W$ for all $p(X) \in F[X]$ and so
$F[\alpha]v_0 \subseteq W$. Thus we have the desired equality.

(2) If $v_0$ is an eigenvector of $\alpha$ associated with an eigenvalue $c$ then for each
$p(X) = \sum_{j=0}^{k} a_j X^j \in F[X]$ we have
$p(\alpha)(v_0) = \sum_{j=0}^{k} a_j \alpha^j(v_0) = \sum_{j=0}^{k} a_j c^j v_0 \in Fv_0$,
proving that $F[\alpha]v_0 = Fv_0$ and so $\dim(F[\alpha]v_0) = 1$. Conversely, assume
that $\dim(F[\alpha]v_0) = 1$. Then $F[\alpha]v_0 = Fv_0$ since $Fv_0$ is a one-dimensional
subspace of $F[\alpha]v_0$. In particular, $\alpha(v_0) \in Fv_0$ and so there exists a scalar $c$ such that
$\alpha(v_0) = cv_0$, which proves that $v_0$ is an eigenvector of $\alpha$.


Since the set $\{v_0, \alpha(v_0), \alpha^2(v_0), \ldots\}$ is a generating set for $F[\alpha]v_0$ over $F$,
Proposition 13.1(2) suggests that $\dim(F[\alpha]v_0)$ can be used to measure how far $v_0$
is from being an eigenvector of $\alpha$.
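
As a rough illustration of this remark, the following Python sketch lists the vectors
v0, A v0, A^2 v0, ... for a square matrix A until they become linearly dependent, so
that the length of the returned list is the dimension of the Krylov subspace. The
function names (`mat_vec`, `rank`, `krylov_basis`) and the sample matrix are
illustrative assumptions rather than notation from the text, and exact rational
arithmetic over Q stands in for a general field F.

```python
from fractions import Fraction

def mat_vec(A, v):
    """Multiply the square matrix A (list of rows) by the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def rank(rows):
    """Rank of a list of vectors, by Gaussian elimination over Q."""
    M = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    ncols = len(M[0]) if M else 0
    for col in range(ncols):
        pivot = next((r for r in range(rk, len(M)) if M[r][col] != 0), None)
        if pivot is None:
            continue
        M[rk], M[pivot] = M[pivot], M[rk]
        for r in range(rk + 1, len(M)):
            f = M[r][col] / M[rk][col]
            M[r] = [a - f * b for a, b in zip(M[r], M[rk])]
        rk += 1
    return rk

def krylov_basis(A, v0):
    """Return [v0, A v0, ..., A^(k-1) v0], the longest independent initial run."""
    basis = []
    v = [Fraction(x) for x in v0]
    # Keep adding A^j v0 while it is independent of the vectors found so far.
    while rank(basis + [v]) > len(basis):
        basis.append(v)
        v = mat_vec(A, v)
    return basis

# Hypothetical sample matrix: [1, 0] is an eigenvector of A, [0, 1] is not.
A = [[2, 1], [0, 2]]
dim_eigen = len(krylov_basis(A, [1, 0]))
dim_other = len(krylov_basis(A, [0, 1]))
```

Here `dim_eigen` is 1, in line with Proposition 13.1(2), while `dim_other` is 2.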

As a first example of the use to which we can put Krylov subspaces, we
will see how to use the minimal polynomial to solve systems of linear
equations. Let $V$ be a vector space over a field $F$ and let $V^\infty$ be the space of
all infinite sequences of elements of $V$. Every polynomial $p(X) = \sum_{j=0}^{k} a_j X^j \in F[X]$
defines an endomorphism $\theta_p$ of $V^\infty$ by
$$\theta_p : [v_0, v_1, \ldots] \mapsto \left[\sum_{j=0}^{k} a_j v_j,\ \sum_{j=0}^{k} a_j v_{j+1},\ \sum_{j=0}^{k} a_j v_{j+2},\ \ldots\right].$$
Note that if $p(X) = c$ is a polynomial of degree
no greater than 0, then $\theta_p = \sigma_c$. It is also easy to verify that $\theta_{pq} = \theta_p\theta_q = \theta_q\theta_p$ for
all $p(X), q(X) \in F[X]$.

A sequence $y \in V^\infty$ is linearly recurrent if and only if there exists a polynomial
$p(X) \in F[X]$ with $y \in \ker(\theta_p)$. In this case, we say that $p(X)$ is a characteristic
polynomial of $y$. If $p(X) \in F[X]$ is a characteristic polynomial of $y \in V^\infty$ and if
$q(X) \in F[X]$ is a characteristic polynomial of $z \in V^\infty$ then
$\theta_{pq}(y + z) = \theta_q\theta_p(y) + \theta_p\theta_q(z) = [0_V, 0_V, \ldots]$
and so $p(X)q(X)$ is a characteristic polynomial of $y + z$. It is
also clear that $p(X)$ is a characteristic polynomial of $cy$ for all $c \in F$. Thus we
see that the set of all linearly recurrent sequences in $V^\infty$ is a subspace of $V^\infty$,
which we will denote by $LR(V)$. If $y \in LR(V)$, there is precisely one characteristic
polynomial which is monic and of minimal degree. This polynomial will be called
the minimal polynomial of $y$. The degree of the minimal polynomial of $y$ is the
order of recurrence of $y$.
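
The closure argument above can be checked numerically on finite prefixes of
sequences. The sketch below is only an illustration under stated assumptions: it takes
V = F = Q, represents a polynomial by its list of coefficients (lowest degree first),
and the names `theta` and `poly_mul` and the two sample sequences are hypothetical
choices, not taken from the text.

```python
def theta(p, seq):
    """Apply theta_p to a finite prefix of a sequence.

    p = [a_0, ..., a_k] (lowest degree first). Only the entries that can be
    computed from the given prefix are returned, so the result is k terms shorter.
    """
    k = len(p) - 1
    return [sum(a * seq[i + j] for j, a in enumerate(p))
            for i in range(len(seq) - k)]

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

y = [2 ** i for i in range(12)]      # satisfies y_{i+1} - 2 y_i = 0
z = [3 ** i for i in range(12)]      # satisfies z_{i+1} - 3 z_i = 0
p = [-2, 1]                          # X - 2, a characteristic polynomial of y
q = [-3, 1]                          # X - 3, a characteristic polynomial of z
s = [a + b for a, b in zip(y, z)]    # the sum y + z

pq = poly_mul(p, q)                  # X^2 - 5X + 6
killed_by_product = theta(pq, s)     # every computable entry is 0
composed = theta(p, theta(q, s))     # theta_p applied after theta_q
```

On these prefixes `theta(pq, s)` agrees with `theta(p, theta(q, s))`, and both consist
of zeros, which is the statement that p(X)q(X) is a characteristic polynomial of y + z.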


Linearly recurrent sequences in $\mathbb{R}^\infty$ were considered by the
seventeenth-century French-born mathematician Abraham de
Moivre, who spent most of his life in exile in England and was one of
the fathers of the theory of probability.


Example Let $F$ be a field and let $n$ be a positive integer. If $A \in M_{n\times n}(F)$, a
polynomial $p(X) \in F[X]$ is a characteristic (resp., minimal) polynomial of the sequence
$[I, A, A^2, \ldots]$ if and only if it is the characteristic (resp., minimal) polynomial of the
matrix $A$.


Example Let $V = F = \mathbb{Q}$ and let $y = [a_0, a_1, \ldots] \in V^\infty$ be the sequence defined
by $a_0 = 0$, $a_1 = 1$, and $a_{i+2} = a_{i+1} + a_i$ for all $i \ge 0$. This sequence is called
the Fibonacci sequence. Its minimal polynomial is $X^2 - X - 1$. The roots of this
polynomial are $\frac{1}{2}(1 \pm \sqrt{5})$. The number $\frac{1}{2}(1 + \sqrt{5})$ is called the golden ratio and
artists consider rectangles the sides of which are related by the golden ratio to be of
high aesthetic value. This ratio, which appears in ancient Egyptian and Babylonian
texts, appears in nature and is basic in the analysis of certain patterns of growth in
nature (such as the spirals of a snail shell or a sunflower), of Greek architecture, of
Renaissance painting, and even such modern designs as the ratio of the dimensions
of a credit card or of A4 paper. Notice that $X^2 - X - 1$ is also the characteristic


polynomial of the matrix f s] € M2x2(R), and so the eigenvalues of this ma- 


trix are also precisely 5(1 + ./5). The eigenspace associated with 5(1 + V5) is 
1 1 
R | atl H a and the eigenspace associated with 4(1 — /5) is R | au hi saa 


Leonardo Fibonacci was born in Italy in the 
twelfth century and educated in Tunis, bringing 
back the fruits of Arab mathematics to Europe. 
His book Liber Abaci, written in 1202, contained 
the first new mathematical research in Christian 
Europe in over 1000 years. In 1509, Fra Luca 
Pacioli, one of the most important Renaissance 
mathematicians, wrote a book, The Divine Pro- 
portion, illustrated by his friend Leonardo da 
Vinci, about the golden ratio. 


We note that if $V = F$ and if $y \in LR(F)$ is a sequence having order of recurrence
at most $n$, then there exist algorithms, which are essentially extensions of the
Euclidean algorithm, to calculate the coefficients of the minimal polynomial of $y$ in
an order of $n^2$ arithmetic operations in $F$.
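
The text does not name a particular algorithm. One standard choice with this quadratic
cost is the Berlekamp-Massey algorithm, and the sketch below is an illustration of it
under stated assumptions, not the method the author necessarily has in mind. It works
over GF(p) for a prime p, expects at least 2n terms of a sequence whose order of
recurrence is at most n, and returns the coefficients of the monic minimal polynomial
with the lowest degree first. The function name `berlekamp_massey` is a hypothetical
choice, and `pow(b, -1, p)` requires Python 3.8 or later.

```python
def berlekamp_massey(seq, p):
    """Monic minimal polynomial of a linearly recurrent sequence over GF(p).

    Returns [c_0, ..., c_{L-1}, 1], lowest degree first, such that
    sum_j c_j * seq[i + j] == 0 (mod p) for every valid index i.
    """
    C = [1]        # current connection polynomial, C[0] = 1
    B = [1]        # connection polynomial before the last length change
    L = 0          # current recurrence length
    m = 1          # steps since the last length change
    b = 1          # discrepancy at the last length change
    for n in range(len(seq)):
        # Discrepancy between seq[n] and the prediction of the current recurrence.
        d = seq[n]
        for i in range(1, L + 1):
            if i < len(C):
                d += C[i] * seq[n - i]
        d %= p
        if d == 0:
            m += 1
            continue
        coef = d * pow(b, -1, p) % p
        T = C[:]
        # C(x) <- C(x) - coef * x^m * B(x)
        if len(C) < len(B) + m:
            C = C + [0] * (len(B) + m - len(C))
        for i, bi in enumerate(B):
            C[i + m] = (C[i + m] - coef * bi) % p
        if 2 * L <= n:
            L, B, b, m = n + 1 - L, T, d, 1
        else:
            m += 1
    C = C + [0] * (L + 1 - len(C))
    # Reverse the connection polynomial to obtain the minimal polynomial.
    return [C[L - i] for i in range(L + 1)]

# Fibonacci sequence reduced modulo 7: minimal polynomial X^2 - X - 1 = X^2 + 6X + 6.
fib_mod_7 = berlekamp_massey([0, 1, 1, 2, 3, 5], 7)
```

For the Fibonacci sequence modulo 7 this returns `[6, 6, 1]`, which is X^2 - X - 1
written with coefficients in GF(7).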

Now let $V$ be a vector space of finite dimension $n$ over a field $F$ and let $\alpha$
be an automorphism of $V$ having minimal polynomial $p(X) \in F[X]$. If $w \in V$
then the sequence $y = [w, \alpha(w), \alpha^2(w), \ldots]$ belongs to $\ker(\theta_p)$ and hence to
$LR(V)$. Therefore, this sequence has a minimal polynomial $q(X) = \sum_{j=0}^{d} c_j X^j$ of some degree $d$,
which divides the polynomial $p(X)$ in $F[X]$. Since $\alpha$ is an automorphism, we
can assume that $c_0 \neq 0$ and so we see that if $u = -c_0^{-1} \sum_{j=1}^{d} c_j \alpha^{j-1}(w)$ then
$\alpha(u) = w$ and so $u = \alpha^{-1}(w)$. In particular, if $V = F^n$ for some positive integer
$n$ and if $\alpha$ is represented by a matrix $A$ with respect to the canonical basis, then
$u = -c_0^{-1} \sum_{j=1}^{d} c_j A^{j-1} w$ is the unique solution of the system of linear equations
$AX = w$. If we set $q^*(X) = -c_0^{-1} \sum_{j=1}^{d} c_j X^{j-1}$ then $u = q^*(A)w$, and this could
be computed quickly were we to already know $q(X)$.

How does one calculate $q^*(X)$ in practice? One method used is basically
probabilistic: we randomly choose a vector $u \in F^n$ and compute the minimal polynomial
$q_u(X)$ of $y_u = [u \odot w, u \odot (Aw), u \odot (A^2w), \ldots] \in F^\infty$, something which can be
done, as we have already observed, in an order of $n^2$ arithmetic operations in $F$.
After that, we check whether the minimal polynomial of $y_u$ is also the minimal
polynomial of $y$. In general, it will not be so, but it will divide the minimal
polynomial of $y$ and so after a reasonable number of such attempts we will, usually, have
enough information on hand to reconstruct the minimal polynomial of $y$.


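Example Let $F = GF(5)$, let $A = \begin{bmatrix} 1 & 4 & 4 \\ 4 & 0 & 3 \\ 1 & 2 & 4 \end{bmatrix} \in M_{3\times 3}(F)$, and let $w = \begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix} \in F^3$.
The sequence $w, Aw, A^2w, \ldots$ looks like
$$\begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix},\ \begin{bmatrix} 0 \\ 3 \\ 3 \end{bmatrix},\ \begin{bmatrix} 4 \\ 4 \\ 3 \end{bmatrix},\ \begin{bmatrix} 2 \\ 0 \\ 4 \end{bmatrix},\ \begin{bmatrix} 3 \\ 0 \\ 3 \end{bmatrix},\ \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix},\ \ldots$$
If we choose $u = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$ we obtain the sequence $y_u = [3, 0, 4, 2, 3, 0, \ldots]$ in $F^\infty$
and the minimal polynomial $q_u(X)$ of this sequence equals $X^2 + 2X + 2$. Since
$q_u(A)w = \begin{bmatrix} 0 \\ 2 \\ 3 \end{bmatrix}$, we see that this polynomial is not the minimal polynomial of $y$.
We will try again with $u = \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix}$. For this choice, we get $y_u = [0, 1, 2, 2, 3, 2, \ldots]$
and this has minimal polynomial $X^3 + 3X + 1$. Since the minimal polynomial of $y$
has to be a multiple of this polynomial, and has to be of degree at most 3, it must equal
$X^3 + 3X + 1$ and, indeed, $q_u(A)w = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$. Therefore, $q^*(X) = -(X^2 + 3) = 4X^2 + 2$ and
$q^*(A)w = \begin{bmatrix} 2 \\ 3 \\ 1 \end{bmatrix}$.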
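
The arithmetic in this example can be checked mechanically. The sketch below reads the
data of the example as A = [[1, 4, 4], [4, 0, 3], [1, 2, 4]] and w = (3, 1, 2) over
GF(5); that reading is consistent with the two projected sequences quoted in the example
but is an assumption about the garbled source. The helper names `mat_vec` and
`apply_poly` are hypothetical. It evaluates q(A)w for q(X) = X^3 + 3X + 1, forms q*(X)
from the formula given before the example, and confirms that u = q*(A)w solves AX = w.

```python
P = 5                                   # arithmetic is done in GF(5)
A = [[1, 4, 4], [4, 0, 3], [1, 2, 4]]   # matrix of the example (assumed reading)
w = [3, 1, 2]                           # right-hand side of AX = w

def mat_vec(M, v, p):
    """Matrix-vector product over GF(p)."""
    return [sum(a * x for a, x in zip(row, v)) % p for row in M]

def apply_poly(coeffs, M, v, p):
    """Return (sum_j coeffs[j] * M^j) v over GF(p); coeffs are lowest degree first."""
    result = [0] * len(v)
    power = v[:]                        # M^0 v
    for c in coeffs:
        result = [(r + c * x) % p for r, x in zip(result, power)]
        power = mat_vec(M, power, p)    # next Krylov vector
    return result

q = [1, 3, 0, 1]                        # q(X) = 1 + 3X + X^3
c0_inv = pow(q[0], -1, P)               # inverse of the constant term (Python 3.8+)
q_star = [(-c0_inv * c) % P for c in q[1:]]   # coefficients of q*(X)

annihilated = apply_poly(q, A, w, P)    # q(A) w
u = apply_poly(q_star, A, w, P)         # candidate solution q*(A) w
check = mat_vec(A, u, P)                # should reproduce w
```

Only matrix-vector products with A are used, which is the "black box" style of access
that makes this approach attractive for large sparse matrices.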


Now let us return to an important problem which was considered in the previous
chapter. Let $V$ be a vector space finitely generated over a field $F$. Given an
endomorphism $\alpha \in \operatorname{End}(V)$, how can we find a basis of $V$ relative to which $\alpha$ is represented
by a matrix which is as nice as possible? We have already found out when $\alpha$ is
diagonalizable. But what if $\alpha$ is not diagonalizable? Given a vector $0_V \neq w \in V$,
there exists a positive integer $k$ such that the set $\{w, \alpha(w), \ldots, \alpha^{k-1}(w)\}$ is
linearly independent but the set $\{w, \alpha(w), \ldots, \alpha^{k}(w)\}$ is linearly dependent. Then
$\{w, \alpha(w), \ldots, \alpha^{k-1}(w)\}$ is a basis for the Krylov subspace $F[\alpha]w$ of $V$, which
is called the canonical basis of this subspace. The restriction of $\alpha$ to $F[\alpha]w$ is
represented by a matrix of the form
$$\begin{bmatrix} 0 & 0 & \cdots & 0 & 0 & c_1 \\ 1 & 0 & \cdots & 0 & 0 & c_2 \\ 0 & 1 & \cdots & 0 & 0 & c_3 \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 0 & c_{k-1} \\ 0 & 0 & \cdots & 0 & 1 & c_k \end{bmatrix}$$
with respect to the canonical basis, where the scalars $c_1, \ldots, c_k$ satisfy
$\alpha^k(w) = \sum_{i=1}^{k} c_i \alpha^{i-1}(w)$.
This, of course, is just the companion matrix of the polynomial $X^k - \sum_{i=1}^{k} c_i X^{i-1}$.

Krylov subspaces are also the basis for a family of non-stationary iterative
algorithms, known as Krylov algorithms, used for approximating solutions to systems of
equations of the form $AX = w$, where $A \in M_{n\times n}(\mathbb{R})$ or $A \in M_{n\times n}(\mathbb{C})$. Similarly,
Krylov subspaces are a basis for a family of non-stationary iterative algorithms,
known as Lanczos algorithms, used for approximating eigenvalues of sufficiently-nice
(e.g., symmetric) matrices. Such algorithms work even under the assumption
that we do not have direct access to the entries of $A$ but do have a “black box”
ability to compute $Av$ or $A^T v$ for any given vector $v \in \mathbb{R}^n$. Of course, they do
not work for all matrices, but when they work they tend to be fairly efficient and
rapid, and are especially good for large sparse matrices. Moreover, they are also
amenable to implementation on parallel computers. Parallel Lanczos algorithms
have also been developed for solving generalized eigenvalue problems. If the
matrices involved are symmetric, Lanczos algorithms can be adapted to work for matrices
over finite fields. However, in this case there are also other algorithms available. In
particular, one should mention the Wiedemann algorithm to solve systems of linear
equations of the form $AX = w$, where $A$ is a large nonsingular matrix over a finite
field. Such problems arise in the computation of discrete logarithms and in other
modes of attack on various encryption methods for transmission of data over the
internet. They have also been used to factor large integers. The Wiedemann algorithm,
which is based on computing the minimal polynomial of a certain linearly-recurrent
sequence, works especially well for sparse matrices, and is amenable to parallel
computation.


Hungarian-born applied mathematician Cornelius Lanczos developed many important
numerical methods for computers in the period after World War II, while working at the US

A nonzero element $a$ of an associative algebra $(K, \bullet)$ is nilpotent if and only if
there exists a positive integer $k$ satisfying $a^k = 0_K$. The smallest such integer $k$, if
one exists, is called the index of nilpotence of $a$. In particular, if $V$ is a vector space
over a field $F$, then $\alpha \in \operatorname{End}(V)$ is nilpotent if and only if there exists a positive
integer $k$ satisfying $\alpha^k = \sigma_0$.


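Example Let $F$ be a field and let $\alpha$ be the endomorphism of $F^3$ defined by
$$\alpha : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} 0 \\ a \\ b \end{bmatrix}.$$
Then $\alpha$ is a nilpotent endomorphism, having index of nilpotence 3. The endomorphism
$\beta$ of $F^3$ defined by
$$\beta : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} -a + 2b + c \\ 0 \\ -a + 2b + c \end{bmatrix}$$
is a nilpotent endomorphism, having index of nilpotence 2.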


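Example Let $F$ be a field and let $\alpha$ and $\beta$ be the endomorphisms of $F^2$ defined
by $\alpha : \begin{bmatrix} a \\ b \end{bmatrix} \mapsto \begin{bmatrix} 0 \\ a \end{bmatrix}$ and $\beta : \begin{bmatrix} a \\ b \end{bmatrix} \mapsto \begin{bmatrix} b \\ 0 \end{bmatrix}$. Both endomorphisms are nilpotent, but
$\alpha + \beta$ is clearly not nilpotent.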
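
Claims of this kind are easy to test by machine. The following sketch is an illustration
only: `mat_mul` and `nilpotency_index` are hypothetical helper names, the matrices are
the representations of the two three-dimensional endomorphisms of the previous example
with respect to the canonical basis, and integer entries (viewed in Q) are assumed, so
a field of positive characteristic would need the arithmetic reduced accordingly. The
search stops after n powers because, in finite dimension, the index of nilpotence of a
nilpotent endomorphism cannot exceed the dimension of the space.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def nilpotency_index(A):
    """Smallest k >= 1 with A^k = 0, or None if A is not nilpotent."""
    n = len(A)
    power = A                                  # holds A^k at step k
    for k in range(1, n + 1):
        if all(x == 0 for row in power for x in row):
            return k
        power = mat_mul(power, A)
    return None                                # A^n != 0, so A is not nilpotent

# (a, b, c) -> (0, a, b)
alpha = [[0, 0, 0],
         [1, 0, 0],
         [0, 1, 0]]
# (a, b, c) -> (-a + 2b + c, 0, -a + 2b + c)
beta = [[-1, 2, 1],
        [0, 0, 0],
        [-1, 2, 1]]
# Two nilpotent 2 x 2 matrices (hypothetical sample) whose sum swaps the coordinates.
n1 = [[0, 0], [1, 0]]
n2 = [[0, 1], [0, 0]]
total = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(n1, n2)]

index_alpha = nilpotency_index(alpha)
index_beta = nilpotency_index(beta)
index_sum = nilpotency_index(total)
```

With these inputs `index_alpha` is 3, `index_beta` is 2, and `index_sum` is `None`,
since the sum of the two 2 x 2 matrices squares to the identity.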

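If $\alpha$ is a nilpotent endomorphism of a vector space $V$ and $w \in V \setminus \ker(\alpha)$ then
the restriction of $\alpha$ to $F[\alpha]w$ is represented with respect to the canonical basis of
$F[\alpha]w$ by a matrix of the form
$$\begin{bmatrix} 0 & 0 & 0 & \cdots & 0 & 0 \\ 1 & 0 & 0 & \cdots & 0 & 0 \\ 0 & 1 & 0 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 0 & \cdots & 1 & 0 \end{bmatrix}.$$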


Proposition 13.2 Let $V$ be a vector space over a field $F$ and let $\alpha$ be a
nilpotent endomorphism of $V$ having index of nilpotence $k$. Then there exists
a vector $w \in V$ satisfying the condition that $\dim(F[\alpha]w) = k$.


Proof We know that $\alpha^k = \sigma_0$ and that there exists a vector $0_V \neq w \in V$ such
that $\alpha^{k-1}(w) \neq 0_V$. We will have proven the proposition should we be able to show
that the set $\{w, \alpha(w), \ldots, \alpha^{k-1}(w)\}$ is linearly independent. And, indeed, assume
that we have scalars $a_0, \ldots, a_{k-1} \in F$, not all zero, satisfying $\sum_{i=0}^{k-1} a_i \alpha^i(w) = 0_V$. Let $t$ be the
smallest index such that $a_t \neq 0$. Then if we apply the endomorphism $\alpha^{k-t-1}$ to
$\sum_{i=t}^{k-1} a_i \alpha^i(w)$ we get
$0_V = a_t \alpha^{k-1}(w) + a_{t+1} \alpha^{k}(w) + \cdots + a_{k-1} \alpha^{2k-t-2}(w) = a_t \alpha^{k-1}(w)$ and
so $a_t = 0$, which is a contradiction. Therefore, we conclude that $a_i = 0$ for all $i$, and
so the set is linearly independent, as required.


In particular, we see that if $V$ is a vector space of finite dimension over a field $F$
and if $\alpha$ is a nilpotent endomorphism of $V$, then the index of nilpotence of $\alpha$ is no
greater than $\dim(V)$.


Proposition 13.3 Let $V$ be a vector space finitely generated over a field $F$
and let $\alpha$ be a nilpotent endomorphism of $V$ having index of nilpotence $k$. If
$w \in V$ satisfies the condition that $\dim(F[\alpha]w) = k$ then the subspace $F[\alpha]w$
of $V$ has a complement in $V$ which is invariant under $\alpha$.


Proof We will proceed by induction on $k$. If $k = 1$ then $\alpha = \sigma_0$ and so $F[\alpha]w = Fw$.
Then there is a subset $B$ of $V \setminus Fw$ such that $B \cup \{w\}$ is a basis for $V$, and $B$
is a basis for a complement of $Fw$ in $V$, which is invariant under $\alpha$ since $\alpha = \sigma_0$.
Assume that $k > 1$ and that the result has
been established for any vector space finitely generated over $F$ and any nilpotent
endomorphism of that space having index of nilpotence less than $k$.

We know that $\operatorname{im}(\alpha)$ is invariant under $\alpha$ and that the restriction of $\alpha$ to
$\operatorname{im}(\alpha)$ is nilpotent, having index of nilpotence $k - 1$. We know that the set
$\{w, \alpha(w), \ldots, \alpha^{k-1}(w)\}$ forms a basis for $F[\alpha]w$ and so the set
$\{\alpha(w), \ldots, \alpha^{k-1}(w)\}$ forms a basis for the image $U$ of $F[\alpha]w$ under $\alpha$. Therefore,
$U = F[\alpha]\alpha(w)$ is a subspace of $\operatorname{im}(\alpha)$ and, by the induction hypothesis, it has a
complement $W_2$ in $\operatorname{im}(\alpha)$ invariant under $\alpha$.

Let $W_0 = \{v \in V \mid \alpha(v) \in W_2\}$. This is a subspace of $V$ containing $W_2$, since $W_2$
is invariant under $\alpha$. But $\alpha(v) \in W_2 \subseteq W_0$ for all $v \in W_0$ and so $W_0$ is also invariant
under $\alpha$. Our first assertion is that $V = F[\alpha]w + W_0$. And, indeed, if $x \in V$ then
$\alpha(x) \in \operatorname{im}(\alpha) = U \oplus W_2$ and so $\alpha(x) = u + w_2$, where $u \in U$ and $w_2 \in W_2$. But
$u = \alpha(y)$ for some $y \in F[\alpha]w$ and $x = y + (x - y)$. The first summand belongs to
$F[\alpha]w$, whereas, as to the second, we have $\alpha(x - y) = \alpha(x) - \alpha(y) = \alpha(x) - u =
w_2 \in W_2$ and so $x - y \in W_0$, proving the assertion.

Our second assertion is that $F[\alpha]w \cap W_0 \subseteq U$. Indeed, if $x \in F[\alpha]w \cap W_0$
then $\alpha(x) \in U \cap W_2 = \{0_V\}$ and so $x \in \ker(\alpha)$. Since $x \in F[\alpha]w$, we know that
there exist scalars $a_0, \ldots, a_{k-1}$ such that $x = \sum_{i=0}^{k-1} a_i \alpha^i(w)$ and hence
$0_V = \alpha(x) = \sum_{i=0}^{k-2} a_i \alpha^{i+1}(w)$, which implies that $a_0 = \cdots = a_{k-2} = 0$. Therefore,
$x = a_{k-1} \alpha^{k-1}(w) \in U$, proving the second assertion.

In particular, from what we have seen, we deduce that the subspaces $W_2$ and
$F[\alpha]w \cap W_0$ are disjoint. Therefore, $W_2 \oplus (F[\alpha]w \cap W_0)$ is a subspace of $W_0$.
This subspace has a complement $W_1$ in $W_0$. Thus we have
$W_0 = W_1 \oplus W_2 \oplus (F[\alpha]w \cap W_0)$.

Our third assertion is that $W = W_1 \oplus W_2$ is a complement of $F[\alpha]w$ in $V$ which
is invariant under $\alpha$, and should we prove this, we will have proven the proposition.
Indeed, we immediately note that $\alpha(W) \subseteq \alpha(W_0) \subseteq W_2 \subseteq W$ and so $W$ is surely
invariant under $\alpha$. Moreover, $F[\alpha]w \cap W = \{0_V\}$ since this subspace is contained
in the intersection of $W$ and $F[\alpha]w \cap W_0$, which, by the choice of $W$, equals $\{0_V\}$.
Finally,
$$V = F[\alpha]w + W_0 = F[\alpha]w + [W_1 + W_2 + (F[\alpha]w \cap W_0)] = F[\alpha]w + W_1 + W_2 = F[\alpha]w + W$$
and so $V = F[\alpha]w \oplus W$.


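Proposition 13.4 (Rational Decomposition Theorem) Let $V$ be a vector
space of finite dimension $n$ over a field $F$ and let $\alpha$ be a nilpotent endomorphism
of $V$ having index of nilpotence $k$. Then there exist natural numbers
$k = k_1 \ge \cdots \ge k_t$ satisfying $k_1 + \cdots + k_t = n$, and there exist vectors $v_1, \ldots, v_t$ in $V$
such that
$$\{v_1, \alpha(v_1), \ldots, \alpha^{k_1-1}(v_1), v_2, \alpha(v_2), \ldots, \alpha^{k_2-1}(v_2), \ldots, v_t, \alpha(v_t), \ldots, \alpha^{k_t-1}(v_t)\}$$
forms a basis for $V$. The matrix which represents $\alpha$ with respect to this basis is of the form
$$\begin{bmatrix} A_1 & O & \cdots & O \\ O & A_2 & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & A_t \end{bmatrix},$$
where each $A_i$ is of the form
$$\begin{bmatrix} 0 & 0 & \cdots & 0 & 0 \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{bmatrix}$$
in $M_{k_i \times k_i}(F)$.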

Proof Choose $k_1 = k$ and choose $v_1 \notin \ker(\alpha^{k_1-1})$. Then $U_1 = F[\alpha]v_1$ has a basis $\{v_1, \alpha(v_1), \ldots, \alpha^{k_1-1}(v_1)\}$. It is invariant under $\alpha$ and of dimension $k_1$. By Proposition 13.3, $V = U_1 \oplus W_1$, where $W_1$ is a subspace of $V$ invariant under $\alpha$. The restriction of $\alpha$ to $W_1$ is a nilpotent endomorphism of $W_1$ with index of nilpotence $k_2 \leq k_1$. We now repeat the above procedure for $W_1$. Pick $v_2 \in W_1 \setminus \ker(\alpha^{k_2-1})$. Then the subspace $U_2 = F[\alpha]v_2$ of $W_1$ has a basis $\{v_2, \alpha(v_2), \ldots, \alpha^{k_2-1}(v_2)\}$. It is invariant under $\alpha$ and of dimension $k_2$. Moreover, we can write $W_1 = U_2 \oplus W_2$, where $W_2$ is invariant under $\alpha$. Continuing in this manner, we end up with a decomposition $V = U_1 \oplus \cdots \oplus U_t$, where each $U_i$ is a subspace of $V$ invariant under $\alpha$ having a basis of the form $\{v_i, \alpha(v_i), \ldots, \alpha^{k_i-1}(v_i)\}$ as above. This proves the first contention of the proposition. The second one follows since $U_i = F[\alpha]v_i$ for all $i$, which leads to a matrix of the desired form.
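
The basis used in Proposition 13.4 can be illustrated on a small case. The following Python sketch takes a hypothetical nilpotent matrix N of index 3 and a vector v with N²v ≠ 0 (both chosen only for this illustration, not taken from the text), forms the vectors v, Nv, N²v, and checks that N acts on them as the block with 1s on the subdiagonal.

```python
# Sketch: a cyclic basis {v, N v, N^2 v} for a nilpotent matrix N of index 3.
# N and v are illustrative choices, not taken from the text.

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matvec(X, v):
    """Product of a matrix with a column vector given as a list."""
    return [sum(X[i][k] * v[k] for k in range(len(v))) for i in range(len(X))]

N = [[0, 1, 2],
     [0, 0, 3],
     [0, 0, 0]]          # strictly upper triangular, hence nilpotent
v = [0, 0, 1]            # N^2 v is nonzero for this choice

basis = [v, matvec(N, v), matvec(N, matvec(N, v))]

# P has the basis vectors v, N v, N^2 v as its columns.
P = [[basis[j][i] for j in range(3)] for i in range(3)]

# J is the block of Proposition 13.4: 1s on the subdiagonal, 0s elsewhere.
J = [[1 if i == j + 1 else 0 for j in range(3)] for i in range(3)]

# N P = P J says that J represents N with respect to the basis {v, N v, N^2 v}.
print(matmul(N, P) == matmul(P, J))
```

The check is written as NP = PJ rather than J = P⁻¹NP so that no matrix inversion is needed.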


A matrix of the form given in Proposition 13.4 is called a representation of the nilpotent endomorphism $\alpha$ in Jordan canonical form. Let $V$ be a vector space over a field $F$ and let $\alpha$ be an endomorphism of $V$ having an eigenvalue $c$. A vector $0_V \neq v \in V$ is a generalized eigenvector of $\alpha$ associated with $c$ of degree $k > 0$ if and only if $v$ is in $\ker((\alpha - c\sigma_1)^k) \setminus \ker((\alpha - c\sigma_1)^{k-1})$. Thus, in particular, the eigenvectors of $\alpha$ associated with $c$, in the previous sense, are just the generalized eigenvectors of $\alpha$ of degree 1 associated with $c$.


The nineteenth-century French mathematician Camille Jordan made 
major contributions to linear algebra, group theory, the theory of finite 
fields, and the beginnings of topology. 


Example Let a be the endomorphism of R* represented with respect to the canon- 


2-211 
ical basis by the matrix 0 i : . This endomorphism has an eigenvector 
0 002 
2 1 
a associated with the eigenvalue 1 and an eigenvector : associated with the 
0 0 
0 
eigenvalue 2. It also has a generalized eigenvector = of degree 2 associated 
0 
0 
with the eigenvalue 2 and a generalized eigenvector E of degree 3 associated 
—1 


with the eigenvalue 2. 


We now prove a generalization of Proposition 12.1. 


Proposition 13.5 Let $V$ be a vector space over a field $F$ and let $\alpha \in \operatorname{End}(V)$ have an eigenvalue $c$. Then the set of all generalized eigenvectors of $\alpha$ (of all degrees) associated with $c$, together with $0_V$, forms a subspace of $V$ which is invariant under any endomorphism of $V$ which commutes with $\alpha$.


Proof Let $a \in F$ and let $v, w \in V$ be generalized eigenvectors of $\alpha$ associated with $c$, of degrees $k$ and $h$, respectively. Then both $v$ and $w$ belong to $\ker((\alpha - c\sigma_1)^{h+k})$ and hence the same is true for $v + w$ and $av$. This means that if $v + w \neq 0_V$ then there exists a positive integer $s \leq h + k$ such that $v + w \in \ker((\alpha - c\sigma_1)^s) \setminus \ker((\alpha - c\sigma_1)^{s-1})$, and if $av \neq 0_V$ then there exists a positive integer $t \leq h + k$ such that $av \in \ker((\alpha - c\sigma_1)^t) \setminus \ker((\alpha - c\sigma_1)^{t-1})$, proving that we have a subspace. If $\beta$ is an endomorphism of $V$ which commutes with $\alpha$ and if $v$ is a generalized eigenvector of $\alpha$ associated with $c$ such that $v \in \ker((\alpha - c\sigma_1)^k)$, then $(\alpha - c\sigma_1)^k\beta(v) = \beta(\alpha - c\sigma_1)^k(v) = 0_V$ so $\beta(v)$ is either $0_V$ or a generalized eigenvector of $\alpha$ associated with $c$, proving invariance.


Let $V$ be a vector space over a field $F$ and let $\alpha \in \operatorname{End}(V)$ have an eigenvalue $c$. The subspace of $V$ defined in Proposition 13.5 is called the generalized eigenspace of $\alpha$ associated with $c$.


Proposition 13.6 Let $V$ be a vector space over a field $F$ and let $\alpha \in \operatorname{End}(V)$ have an eigenvalue $c$. Let $v$ be a generalized eigenvector of degree $k$ associated with $c$. Then the set of vectors $\{v, (\alpha - c\sigma_1)(v), \ldots, (\alpha - c\sigma_1)^{k-1}(v)\}$ is linearly independent.


Proof Set $\beta = \alpha - c\sigma_1$ and, for each $1 \leq j \leq k$, let $v_j = \beta^{k-j}(v)$. Assume that there exist scalars $c_1, \ldots, c_k \in F$ satisfying $\sum_{j=1}^{k} c_j v_j = 0_V$. Then $0_V = \beta^{k-1}\left(\sum_{j=1}^{k} c_j v_j\right) = \beta^{k-1}(c_k v_k) = c_k\beta^{k-1}(v_k)$ and so, since $\beta^{k-1}(v_k) = \beta^{k-1}(v) \neq 0_V$, we conclude that $c_k = 0$. We work backwards in this manner to see that $c_j = 0$ for all $1 \leq j \leq k$, and so the given set is linearly independent.


In particular, let $V$ be a vector space of finite dimension $n$ over a field $F$ and let $\alpha \in \operatorname{End}(V)$. If $v$ is a generalized eigenvector of $\alpha$ of degree $k$ associated with an eigenvalue $c$ of $\alpha$, then we must have $k \leq n$. Thus we see that $\dim(V)$ is an upper bound on the degree of any generalized eigenvector of $\alpha$, and that the generalized eigenspace of $\alpha$ associated with an eigenvalue $c$ is just $\ker((\alpha - c\sigma_1)^n)$.
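
To see this bound at work, the following Python sketch computes the dimensions of $\ker((A - cI)^k)$ for $k = 1, \ldots, n$ by exact Gaussian elimination over the rationals. The matrix A and the helper names `rank` and `nullities` are hypothetical, introduced only for this illustration; the point is that the chain of kernels stops growing by step n at the latest.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def rank(M):
    """Rank of a matrix, by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def nullities(A, c):
    """List of dim ker((A - cI)^k) for k = 1, ..., n."""
    n = len(A)
    B = [[A[i][j] - (c if i == j else 0) for j in range(n)] for i in range(n)]
    out, power = [], B
    for _ in range(n):
        out.append(n - rank(power))
        power = matmul(power, B)
    return out

# Illustrative matrix: eigenvalue 2 with a 2 x 2 block, eigenvalue 3 simple.
A = [[2, 0, 0],
     [1, 2, 0],
     [0, 0, 3]]

print(nullities(A, 2))   # kernels grow, then stabilize
print(nullities(A, 3))   # already stable at k = 1
```

The last entry of each list is the dimension of the generalized eigenspace, and the first entry is the geometric multiplicity of the eigenvalue.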


Proposition 13.7 Let $V$ be a vector space of finite dimension $n$ over a field $F$ and let $\alpha \in \operatorname{End}(V)$ satisfy the condition that the characteristic polynomial $p(X)$ of $\alpha$ is completely reducible, say $p(X) = \prod_{j=1}^{m} (X - c_j)^{n_j}$, where $\operatorname{spec}(\alpha) = \{c_1, \ldots, c_m\}$. Then there exist subspaces $U_1, \ldots, U_m$ of $V$, each of which is invariant under $\alpha$, such that:

(1) $V = U_1 \oplus \cdots \oplus U_m$;

(2) $\dim(U_h) = n_h$ for each $1 \leq h \leq m$;

(3) For each $1 \leq h \leq m$, the restriction of $\alpha$ to $U_h$ is of the form $c_h\tau_h + \beta_h$, where $\beta_h \in \operatorname{End}(U_h)$ is nilpotent and $\tau_h$ is the restriction of $\sigma_1$ to $U_h$.


Proof For each $1 \leq h \leq m$, consider the endomorphism $\beta_h = \alpha - c_h\sigma_1$ of $V$, and let $U_h$ be the generalized eigenspace of $\alpha$ associated with $c_h$. Then $U_h$ is a subspace of $V$ invariant under $\beta_h$ and also invariant under $\alpha$ since, if $v \in U_h$ satisfies $\beta_h^k(v) = 0_V$, then $\beta_h^k\alpha(v) = \alpha\beta_h^k(v) = \alpha(0_V) = 0_V$. We claim that there exists a positive integer $k$, independent of $h$, such that all elements of $U_h$ are generalized eigenvectors of $\alpha$ of degree at most $k$. Indeed, we see that $\ker(\beta_h) \subseteq \ker(\beta_h^2) \subseteq \ker(\beta_h^3) \subseteq \cdots$ and since $V$ is finitely generated, there are at most a finite number of proper containments. Thus there exists a $k$ such that $\ker(\beta_h^k) = \ker(\beta_h^{k+1}) = \cdots$. From here it is clear that $\ker(\beta_h^k) = U_h$, proving the claim.

In particular, this claim shows that the restriction of $\beta_h$ to $U_h$ is a nilpotent endomorphism having index of nilpotence at most $k$. More than that, the restriction of $\alpha$ to $U_h$ equals $c_h\tau_h + \beta_h$, proving (3). We now notice that if $t \neq h$ then $U_t$ is invariant under $\beta_h$. We claim that the restriction of $\beta_h$ to $U_t$ is an automorphism. Since $U_t$ is finite-dimensional, it is sufficient to prove that it is a monomorphism. Indeed, suppose that $v \in U_t \cap \ker(\beta_h)$. Then there exists a positive integer $k$ such that $\beta_t^k(v) = 0_V$ and so $0_V = \beta_t^k(v) = [\beta_h + (c_h - c_t)\sigma_1]^k(v) = (c_h - c_t)^k v$ and, since $c_h - c_t \neq 0$, we must have $v = 0_V$, proving the claim.

The next step is to show that the collection $\{U_1, \ldots, U_m\}$ of subspaces of $V$ is independent. Indeed, let $1 \leq h \leq m$ and let $Y = U_h \cap \sum_{j \neq h} U_j$. Then $Y$ is a subspace of $V$ invariant under $\beta_h$, on which $\beta_h$ is monic (since $Y \subseteq \sum_{j \neq h} U_j$) and nilpotent (since $Y \subseteq U_h$), which is possible only if $Y = \{0_V\}$. This proves independence, and we will set $U = U_1 \oplus \cdots \oplus U_m$. We want to show that $U = V$. Let $v \in V$. By the Cayley–Hamilton Theorem (Proposition 12.16), we see that $\alpha$ annihilates its characteristic polynomial $p(X)$ and so $\left[\prod_{j=1}^{m} \beta_j^{n_j}\right](v) = 0_V \in U$. Suppose that $\beta_1^{n_1}(v) \in U$, say that it is equal to $\sum_{h=1}^{m} u_h$, where $u_h \in U_h$ for all $1 \leq h \leq m$. Since $\beta_1$ is epic when restricted to $U_h$, for each $h \neq 1$, we can find an element $w_h$ of $U_h$, for each $1 < h \leq m$, such that $u_h = \beta_1^{n_1}(w_h)$ for all such $h$. Therefore, $\beta_1^{n_1}\left(v - \sum_{h=2}^{m} w_h\right) = u_1 \in U_1$. By definition of $U_1$, it follows that $w_1 = v - \sum_{h=2}^{m} w_h \in U_1$. Therefore, $v = \sum_{h=1}^{m} w_h \in U$. If, on the other hand, $\beta_1^{n_1}(v) \notin U$, then let $t$ be the smallest element of $\{2, \ldots, m\}$ satisfying the condition that $\left[\prod_{j=1}^{t} \beta_j^{n_j}\right](v) \in U$ and $\left[\prod_{j=1}^{t-1} \beta_j^{n_j}\right](v) \notin U$. A similar argument to the preceding then shows that we must have $v \in U$.

We are left to show that $\dim(U_h) = n_h$ for all $1 \leq h \leq m$. Pick a basis for $V$ which is a union of bases of the $U_h$. With respect to this basis, the endomorphism $\alpha$ is represented by a matrix of the form
$$A = \begin{bmatrix} A_1 & O & \cdots & O \\ O & A_2 & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & A_m \end{bmatrix},$$
where each $A_h$ is a matrix representing the restriction of $\alpha$ to $U_h$. By Proposition 11.12, the characteristic polynomial of $\alpha$ is therefore of the form $|XI - A| = \prod_{h=1}^{m} |XI - A_h|$. From this decomposition and from the fact that each $\beta_h$ restricts to an automorphism of $U_t$ for all $t \neq h$, it follows that the only eigenvalue of the restriction of $\alpha$ to $U_h$ is $c_h$, and the algebraic multiplicity of this eigenvalue is at most $n_h$. Since $\sum_h \dim(U_h) = \sum_h n_h$, it then follows that $\dim(U_h) = n_h$ for each $h$.


Proposition 13.7 shows that when conditions are right (for example, when the field $F$ is algebraically closed) and when we are given an endomorphism $\alpha$ of a finite-dimensional vector space $V$, it is possible to choose a basis for $V$ relative to which $\alpha$ is represented in a particularly simple form. We do this in two steps.

(I) Write $V$ as a direct sum $U_1 \oplus \cdots \oplus U_m$ as above. By choosing a basis for $V$ which is a union of bases of the $U_h$, we get a matrix representing $\alpha$ composed of blocks strung out along the main diagonal, each representing the restriction of $\alpha$ to one of the subspaces $U_h$.

(II) For each $h$, the restriction of $\alpha$ to $U_h$ equals $c_h\tau_h + \beta_h$, where $\beta_h$ is a nilpotent endomorphism of $U_h$. We now choose a basis of $U_h$ relative to which $\beta_h$ is represented in Jordan canonical form.


Thus, in the end, we have a representation of $\alpha$ by a matrix of the form
$$\begin{bmatrix} A_1 & O & \cdots & O \\ O & A_2 & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & A_m \end{bmatrix},$$
where each block $A_h$ is a matrix with blocks of the form
$$\begin{bmatrix} c_h & 0 & 0 & \cdots & 0 \\ 1 & c_h & 0 & \cdots & 0 \\ 0 & 1 & c_h & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 & c_h \end{bmatrix}$$
on its diagonal (these may be $1 \times 1$!) and all other entries equal to 0. A matrix of this form is called the Jordan canonical form of $\alpha$. By Proposition 13.7, we see that if $V$ is a vector space finitely generated over a field $F$ and if $\alpha$ is an endomorphism of $V$ having a completely reducible characteristic polynomial in $F[X]$, then there is a basis of $V$ relative to which $\alpha$ can be represented by a matrix in Jordan canonical form. Thus, this can always be done if the field $F$ is algebraically closed. If $F$ is not algebraically closed then it is always possible to extend the field $F$ to a larger field $K$ such that the characteristic polynomial of $\alpha$ is completely reducible in $K[X]$.


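Example Consider $\alpha \in \operatorname{End}(\mathbb{R}^4)$ represented with respect to the canonical basis by
$$A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -1 & 4 & -6 & 4 \end{bmatrix}.$$
The characteristic polynomial of $A$ is $X^4 - 4X^3 + 6X^2 - 4X + 1 = (X - 1)^4$ and so its only eigenvalue is 1. Then $A$ is similar to
$$B = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix}$$
in Jordan canonical form. Indeed, $B = PAP^{-1}$, where
$$P = \begin{bmatrix} -1 & 3 & -3 & 1 \\ 1 & -2 & 1 & 0 \\ -1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}.$$


Example Consider $\alpha \in \operatorname{End}(\mathbb{R}^5)$ represented with respect to the canonical basis by
$$A = \begin{bmatrix} 3 & 0 & 0 & 0 & 0 \\ 0 & 4 & 2 & -1 & 4 \\ 0 & 0 & 2 & 0 & 0 \\ 0 & 1 & 3 & 2 & 1 \\ 0 & 0 & 0 & 0 & 2 \end{bmatrix}.$$
The characteristic polynomial of $A$ is $(X - 3)^3(X - 2)^2$ and $A$ is similar to
$$B = \begin{bmatrix} 2 & 0 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 & 0 \\ 0 & 1 & 3 & 0 & 0 \\ 0 & 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 0 & 3 \end{bmatrix}$$
in Jordan canonical form. Indeed, $B = PAP^{-1}$, where
$$P = \begin{bmatrix} 0 & 0 & -3 & 0 & -4 \\ 0 & 1 & -1 & -1 & 3 \\ 2 & 1 & 3 & 0 & 1 \\ 0 & 0 & 0 & 0 & -1 \\ -1 & 0 & 0 & 0 & 0 \end{bmatrix}.$$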
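
The two similarity claims above can be checked mechanically. The Python sketch below verifies $B = PAP^{-1}$ in the equivalent form $PA = BP$ together with $\det(P) \neq 0$, using exact integer arithmetic. The matrices are the ones from the two preceding examples as read from the text (for the first example, P is taken to have rows (−1, 3, −3, 1), (1, −2, 1, 0), (−1, 1, 0, 0), (1, 0, 0, 0)); the helper names are assumptions made for this illustration.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def is_similar_via(P, A, B):
    """True when P is nonsingular and P A = B P, i.e. B = P A P^{-1}."""
    return det(P) != 0 and matmul(P, A) == matmul(B, P)

# First example: 4 x 4, single eigenvalue 1.
A1 = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [-1, 4, -6, 4]]
B1 = [[1, 0, 0, 0], [1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1]]
P1 = [[-1, 3, -3, 1], [1, -2, 1, 0], [-1, 1, 0, 0], [1, 0, 0, 0]]

# Second example: 5 x 5, eigenvalues 2 and 3.
A2 = [[3, 0, 0, 0, 0], [0, 4, 2, -1, 4], [0, 0, 2, 0, 0],
      [0, 1, 3, 2, 1], [0, 0, 0, 0, 2]]
B2 = [[2, 0, 0, 0, 0], [0, 3, 0, 0, 0], [0, 1, 3, 0, 0],
      [0, 0, 0, 2, 0], [0, 0, 0, 0, 3]]
P2 = [[0, 0, -3, 0, -4], [0, 1, -1, -1, 3], [2, 1, 3, 0, 1],
      [0, 0, 0, 0, -1], [-1, 0, 0, 0, 0]]

print(is_similar_via(P1, A1, B1), is_similar_via(P2, A2, B2))
```

Row by row, the identity $PA = BP$ says that the first row of each Jordan block of $P$ is a left eigenvector of $A$ and each later row $p_i$ satisfies $p_i(A - cI) = p_{i-1}$.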


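Example Consider $\alpha \in \operatorname{End}(\mathbb{C}^4)$ which is represented with respect to the canonical basis by
$$A = \begin{bmatrix} 0 & 0 & 2i & 0 \\ 1 & 0 & 0 & 2i \\ -2i & 0 & 0 & 0 \\ 0 & -2i & 1 & 0 \end{bmatrix}.$$
The characteristic polynomial of $A$ is $(X - 2)^2(X + 2)^2$ and $A$ is similar to
$$B = \begin{bmatrix} -2 & 0 & 0 & 0 \\ 1 & -2 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 1 & 2 \end{bmatrix}.$$
Indeed, $B = PAP^{-1}$, where
$$P = \begin{bmatrix} 1 & 0 & -i & 0 \\ 0 & 1 & 0 & -i \\ 1 & 0 & i & 0 \\ 0 & 1 & 0 & i \end{bmatrix}.$$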
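
For the complex example, a short Python sketch can confirm both the similarity and the stated characteristic polynomial. It assumes the second row of A is (1, 0, 0, 2i), the reading that is consistent with the given B and P, and it uses Python's built-in complex numbers, which are exact here because every entry is a small Gaussian integer. The helper `shift` is a name introduced for this illustration.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def shift(M, c):
    """Return M - cI."""
    return [[M[i][j] - (c if i == j else 0) for j in range(len(M))]
            for i in range(len(M))]

A = [[0, 0, 2j, 0],
     [1, 0, 0, 2j],
     [-2j, 0, 0, 0],
     [0, -2j, 1, 0]]
B = [[-2, 0, 0, 0],
     [1, -2, 0, 0],
     [0, 0, 2, 0],
     [0, 0, 1, 2]]
P = [[1, 0, -1j, 0],
     [0, 1, 0, -1j],
     [1, 0, 1j, 0],
     [0, 1, 0, 1j]]

# Similarity, written without inverses: P A = B P.
similar = matmul(P, A) == matmul(B, P)

# Cayley-Hamilton check: (A - 2I)^2 (A + 2I)^2 should be the zero matrix.
minus, plus = shift(A, 2), shift(A, -2)
product = matmul(matmul(minus, minus), matmul(plus, plus))
annihilated = all(entry == 0 for row in product for entry in row)

print(similar, annihilated)
```

The second check only shows that A satisfies the polynomial $(X - 2)^2(X + 2)^2$; that this is the characteristic polynomial follows from the similarity with B.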


We now use Jordan canonical forms to prove a result interesting in its own right. 


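Proposition 13.8 Let $n$ be a positive integer and let $A \in M_{n \times n}(F)$, where $F$ is an algebraically-closed field. Then $A$ can be written as a product of two symmetric matrices.


Proof By Proposition 13.7, we know that $A$ is similar to a matrix $B$ in Jordan canonical form. In other words, there exists a nonsingular matrix $Q \in M_{n \times n}(F)$ satisfying $A = QBQ^{-1}$. If we can write $B = CD$, where both $C$ and $D$ are symmetric, then $A = QCDQ^{-1} = (QCQ^T)((Q^T)^{-1}DQ^{-1}) = (QCQ^T)((Q^{-1})^TDQ^{-1})$, where both $QCQ^T$ and $(Q^{-1})^TDQ^{-1}$ are symmetric. Therefore, without loss of generality, we can assume that $A$ is in Jordan canonical form, say
$$A = \begin{bmatrix} A_1 & O & \cdots & O \\ O & A_2 & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & A_m \end{bmatrix},$$
where each block $A_h$ is of the form
$$\begin{bmatrix} a_h & 0 & 0 & \cdots & 0 \\ 1 & a_h & 0 & \cdots & 0 \\ 0 & 1 & a_h & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 & a_h \end{bmatrix} \in M_{n_h \times n_h}(F).$$
Define the matrix $D_h \in M_{n_h \times n_h}(F)$ to be $[d_{ij}]$, where
$$d_{ij} = \begin{cases} 1 & \text{if } i + j = n_h + 1, \\ 0 & \text{otherwise.} \end{cases}$$
Then $D_h$ is a symmetric matrix satisfying $D_h^{-1} = D_h$. Moreover, the matrix
$$D = \begin{bmatrix} D_1 & O & \cdots & O \\ O & D_2 & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & D_m \end{bmatrix} \in M_{n \times n}(F)$$
is also symmetric and satisfies $D^{-1} = D$. Furthermore, the matrix
$$C = \begin{bmatrix} A_1D_1 & O & \cdots & O \\ O & A_2D_2 & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & A_mD_m \end{bmatrix} \in M_{n \times n}(F)$$
is also symmetric and $A = CD$, as required.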


Another interesting result is the following. 


Proposition 13.9 If $A \in M_{n \times n}(F)$, where $F$ is an algebraically-closed field, then $A$ is similar to its transpose.


Proof Since $F$ is algebraically closed, we know that $A$ is similar to a matrix $B$ in Jordan canonical form, and to show that $B$ is similar to its transpose it suffices to show that each Jordan block is similar to its transpose. This is surely true of blocks of size $1 \times 1$. If
$$C = \begin{bmatrix} c & 0 & 0 & \cdots & 0 \\ 1 & c & 0 & \cdots & 0 \\ 0 & 1 & c & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 & c \end{bmatrix}$$
is a Jordan block of size $h \times h$ for $h > 1$, then we note that $PCP = C^T$, where $P = [p_{ij}]$ is the involutory $h \times h$ matrix defined by the condition that
$$p_{ij} = \begin{cases} 1 & \text{if } i + j = h + 1, \\ 0 & \text{otherwise,} \end{cases}$$
thus proving the result.
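
Both this proof and that of Proposition 13.8 rest on the same matrix with 1s on the antidiagonal. The following Python sketch, a minimal illustration for one hypothetical Jordan block (size 4, eigenvalue 5, chosen arbitrarily), checks that this matrix is symmetric and involutory, that it conjugates the block into its transpose, and that it splits the block into a product of two symmetric matrices.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(M):
    return [list(col) for col in zip(*M)]

def jordan_block(c, h):
    """h x h block with c on the diagonal and 1s on the subdiagonal."""
    return [[c if i == j else 1 if i == j + 1 else 0 for j in range(h)]
            for i in range(h)]

def exchange(h):
    """h x h matrix with 1s on the antidiagonal (i + j = h + 1, one-based)."""
    return [[1 if i + j == h - 1 else 0 for j in range(h)] for i in range(h)]

h = 4
A = jordan_block(5, h)
D = exchange(h)
I = [[1 if i == j else 0 for j in range(h)] for i in range(h)]

# D is symmetric and is its own inverse.
print(D == transpose(D), matmul(D, D) == I)

# Proposition 13.9: D A D is the transpose of A.
print(matmul(matmul(D, A), D) == transpose(A))

# Proposition 13.8: C = A D is symmetric and A = C D.
C = matmul(A, D)
print(C == transpose(C), matmul(C, D) == A)
```

Multiplying by the exchange matrix on the right reverses the order of the columns, and on the left reverses the order of the rows, which is why a single matrix serves both propositions.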


Contemporary American mathematician Richard Brualdi and Chinese mathematicians Pei Pei and Xingzhi Zhan have shown that the Jordan canonical form of a matrix in $M_{n \times n}(\mathbb{C})$ is the best one can get in terms of sparseness: they proved that among all the matrices that are similar to a given matrix in $M_{n \times n}(\mathbb{C})$, the Jordan canonical form has the greatest number of off-diagonal zero entries.


Exercises 


Exercise 847
Find endomorphisms $\alpha$ and $\beta$ of $\mathbb{R}^3$ satisfying the condition that $\alpha\beta$ is not nilpotent but $c\alpha + d\beta$ is nilpotent for all $c, d \in \mathbb{R}$.


Exercise 848
Let $V$ be a vector space over a field $F$ and let $\alpha \in \operatorname{End}(V)$ be nilpotent, having index of nilpotence $k > 0$. Show that $\sigma_1 + \alpha \in \operatorname{Aut}(V)$.


Exercise 849
Let $V$ be a vector space finitely generated over $\mathbb{C}$. Do there exist endomorphisms $\alpha$ and $\beta$ of $V$ satisfying the condition that $\sigma_1 + \alpha\beta - \beta\alpha$ is nilpotent?


Exercise 850
Let $F$ be a field and let $B$ be a given basis of $F^3$. Let $a \in F$ and let $\alpha$ be the endomorphism of $F^3$ satisfying
$$\Phi_{BB}(\alpha) = \begin{bmatrix} -a & a & a \\ 0 & 0 & 0 \\ -a & a & a \end{bmatrix}.$$
For which values of $a$ is this endomorphism nilpotent?


Exercise 851 
Let F be a field. Give an example of a nilpotent endomorphism of F’ having 
index of nilpotence 3.

Exercise 852
Let $\alpha$ be the endomorphism of $V = \mathbb{R}^4$ represented with respect to a basis $B$ of $V$ by the matrix
$$\begin{bmatrix} 2 & -8 & 12 & -60 \\ 2 & -5 & 9 & -48 \\ 6 & -17 & 29 & -152 \\ 1 & -3 & 5 & -26 \end{bmatrix}.$$
Show that $\alpha$ is nilpotent and find its index of nilpotence.


Exercise 853
Let $V$ be a vector space over a field $F$ and let $\alpha \in \operatorname{End}(V)$ be nilpotent. Does $\beta\alpha$ have to be nilpotent for all $\beta \in \operatorname{End}(V)$?


Exercise 854
Let $V$ be a vector space over a field $F$ and let $\alpha \in \operatorname{End}(V)$ be nilpotent. Find $\operatorname{spec}(\alpha)$.


Exercise 855
Let $V$ be a vector space finitely generated over $\mathbb{C}$ and let $\alpha \in \operatorname{End}(V)$ satisfy $\operatorname{spec}(\alpha) = \{0\}$. Show that $\alpha$ is nilpotent.


Exercise 856
Let $V$ be a vector space over a field $F$ and let $\alpha \in \operatorname{End}(V)$ be nilpotent, having index of nilpotence $k$. Find the minimal polynomial of $\alpha$.


Exercise 857
Let $V$ be a vector space finitely generated over a field $F$ and let $\alpha \in \operatorname{End}(V)$ satisfy the condition that for each $v \in V$ there exists a positive integer $n(v)$ satisfying $\alpha^{n(v)}(v) = 0_V$. Show that $\alpha$ is nilpotent. Does the same result hold if $V$ is not assumed to be finitely generated over $F$?


Exercise 858
Let $F$ be a field and let $\alpha$ be an endomorphism of $F^3$ represented with respect to a basis $B$ of $F^3$ by the matrix
$$\begin{bmatrix} 0 & a & 0 \\ 0 & 0 & b \\ 0 & 0 & 0 \end{bmatrix},$$
where $a$ and $b$ are nonzero scalars. Does there exist an endomorphism $\beta$ of $F^3$ satisfying $\beta^2 = \alpha$?


Exercise 859
Let $\alpha$ be a nilpotent endomorphism of a vector space $V$ over a field $F$ having characteristic 0. Show that there exists an endomorphism $\beta$ of $V$ belonging to $F[\alpha]$ and satisfying $\beta^2 = \sigma_1 + \alpha$.

Exercise 860
Let $\alpha \in \operatorname{End}(\mathbb{R}^3)$ be represented with respect to the canonical basis by the matrix
$$\begin{bmatrix} 1 & 2 & -2 \\ 3 & 0 & 3 \\ 1 & 1 & -2 \end{bmatrix}.$$
Calculate $\mathbb{R}[\alpha]\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$.

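Exercise 861
Let $V$ be the space of all infinitely-differentiable functions from $\mathbb{R}$ to itself. Let $\delta$ be the endomorphism of $V$ which assigns to each function its derivative. What is $\mathbb{R}[\delta]\sin(x)$?


Exercise 862
Define $\alpha \in \operatorname{End}(\mathbb{R}^3)$ by $\alpha\colon \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} a + c \\ b - a \\ b \end{bmatrix}$. Find $\mathbb{R}[\alpha]\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$.

Exercise 863
Let $V$ be a vector space over a field $F$ and let $\alpha \in \operatorname{End}(V)$. Let $v \in V$ be a vector satisfying $F[\alpha^2]v = V$. Show that $F[\alpha]v = V$.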


Exercise 864 
Given a € R, let aq € End(R*) be represented with respect to the canonical basis 
0 a 1 0 
i 1 —2 1 1 i : i . 
by the matrix 0 0 4 ol For which values of a is the dimension of 
(0) 1 0 -2 
0 
(0) 
R[æa] 0 equal to 3? 
—1 


Exercise 865
Let $\alpha \in \operatorname{End}(\mathbb{R}^3)$ be represented with respect to the canonical basis by the matrix
$$\begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}.$$
Find the eigenvalues of $\alpha$ and the generalized eigenspace associated with each.

Exercise 866
Let $V$ be a vector space finitely generated over $\mathbb{C}$ and let $\alpha \in \operatorname{End}(V)$. Show that $\alpha$ is diagonalizable if and only if every generalized eigenvector of $\alpha$ is an eigenvector of $\alpha$.


Exercise 867
Let $B = \{v_1, v_2, v_3\}$ be the canonical basis of $\mathbb{R}^3$ and let $\alpha$ be the endomorphism of $\mathbb{R}^3$ satisfying
$$\Phi_{BB}(\alpha) = \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$
Show that $W = \mathbb{R}\{v_1, v_3\}$ and $Y = \mathbb{R}v_2$ are complements of each other in $\mathbb{R}^3$ and that each of these spaces is invariant under $\alpha$.


Exercise 868
Let $A = \begin{bmatrix} 1 & 2 & 0 \\ 0 & 2 & 0 \\ 2 & -2 & -1 \end{bmatrix} \in M_{3 \times 3}(\mathbb{R})$. Find the Jordan canonical form of $A$.

Exercise 869
Let $A = \begin{bmatrix} -2 & 8 & 6 \\ -4 & 10 & 6 \\ 4 & -8 & -4 \end{bmatrix} \in M_{3 \times 3}(\mathbb{R})$. Find the Jordan canonical form of $A$.

Exercise 870
Let $O \neq A \in M_{3 \times 3}(\mathbb{C})$ be of the form
$$\begin{bmatrix} 0 & a & -b \\ -a & 0 & c \\ b & -c & 0 \end{bmatrix},$$
where $a$, $b$, and $c$ are real numbers. What is the Jordan canonical form of $A$?


Exercise 871 
Let A € M5 x5(Q) be a matrix in Jordan canonical form having minimal poly- 
nomial (X — 3). What does A look like? 


Exercise 872
Give an example of a matrix in $M_{4 \times 4}(\mathbb{R})$ which is not similar to a matrix in Jordan canonical form.


Exercise 873
Let $V$ be a vector space finitely generated over a field $F$ and let $\alpha$ and $\beta$ be nilpotent endomorphisms of $V$ represented with respect to some given basis by matrices $A$ and $B$, respectively. If the matrices $A$ and $B$ are similar, does the index of nilpotence of $\alpha$ have to equal that of $\beta$?

Exercise 874
Let $F$ be a field and let $A = \begin{bmatrix} a & 0 & 0 \\ 1 & a & 0 \\ 0 & 1 & a \end{bmatrix} \in M_{3 \times 3}(F)$. Show that
$$A^k = \begin{bmatrix} a^k & 0 & 0 \\ ka^{k-1} & a^k & 0 \\ \frac{1}{2}k(k-1)a^{k-2} & ka^{k-1} & a^k \end{bmatrix}$$
for all $k > 0$.


Exercise 875
Let $n$ be a positive integer and, for all $1 \leq i, j \leq n$, let $p_{ij}(X) \in \mathbb{C}[X]$. Let $\varphi\colon \mathbb{C} \to M_{n \times n}(\mathbb{C})$ be the function defined by $\varphi\colon z \mapsto [p_{ij}(z)]$. Furthermore, let us assume that $\varphi(z)$ is nonsingular for each $z \in \mathbb{C}$. Show that there exists a nonzero complex number $d$ such that $|\varphi(z)| = d$ for all $z \in \mathbb{C}$.


Exercise 876
For each $t \in \mathbb{C}$, let $\alpha_t$ be the endomorphism of $\mathbb{C}^3$ represented with respect to the canonical basis by the matrix
$$\begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ t & 0 & 0 \end{bmatrix}.$$
Is the representation of $\alpha_t$ in Jordan canonical form dependent on $t$?

Exercise 877
Let $A = \begin{bmatrix} 0 & 0 & 4 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \in M_{3 \times 3}(\mathbb{C})$. Find the set of all $c \in \mathbb{C}$ satisfying the condition that $cA$ is similar to $A$.
Exercise 878 
1 -1 1 0 
Find the Jordan canonical form of A = 5 1 1 0 M3x3(R). 
0 00 


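Exercise 879
Let $a \in \mathbb{R}$ be positive. Find the Jordan canonical form of
$$\begin{bmatrix} a & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \in M_{4 \times 4}(\mathbb{R}).$$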


Exercise 880
Let $A \in M_{n \times n}(\mathbb{R})$ differ from $I$ and $O$. If $A$ is idempotent, show that its Jordan canonical form is a diagonal matrix.

Exercise 881
Let $I \neq A \in M_{n \times n}(\mathbb{R})$ be an involutory matrix. Show that the Jordan canonical form of $A$ is a diagonal matrix.


14 The Dual Space


Let $V$ be a vector space over a field $F$. A linear transformation from $V$ to $F$ (considered as a vector space over itself) is a linear functional on $V$. The space $\operatorname{Hom}(V, F)$ of all such linear functionals is called the dual space of $V$ and will be denoted by $D(V)$. Note that $D(V)$ is a vector space over $F$, the identity element of which for addition is the 0-functional, $v \mapsto 0$. Since $\dim(F) = 1$, we immediately see that every linear functional other than the 0-functional must be an epimorphism.



Linear functionals were first studied systematically 
by the French mathematician Jacques Hadamard, 
whose long life ranged from the mid nineteenth 
century to the mid twentieth century, and by his 
student Maurice Fréchet. Their work on function- 
als turned them into a major tool in analysis. 


Example Let $F$ be a field and let $n$ be a positive integer. Any $v \in F^n$ defines a linear functional in $D(F^n)$ by $w \mapsto v \odot w$.


Example Let $V$ be a vector space over a field $F$ and let $B$ be a basis for $V$. Each $u \in B$ defines a function $f_u \in F^B$ defined by
$$f_u\colon u' \mapsto \begin{cases} 1 & \text{if } u' = u, \\ 0 & \text{otherwise,} \end{cases}$$
and by Proposition 6.2 we know that this function in turn defines a linear functional $\delta_u \in D(V)$. In particular, if $V = F^n$ and if $B = \{u_1, \ldots, u_n\}$ is the canonical basis for $V$, then $\delta_{u_h}\colon \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix} \mapsto a_h$ for each $1 \leq h \leq n$.

Example Suppose that $V = C(a, b)$ and that $g_0 \in V$. Then the function $\eta\colon V \to \mathbb{R}$ defined by $\eta\colon f \mapsto \int_a^b f(x)g_0(x)\,dx$ belongs to $D(V)$. Hadamard's initial work on linear functionals concerned those of the form $f \mapsto \lim_{n \to \infty} \int_a^b f(x)g_n(x)\,dx$ for suitable sequences $g_1, g_2, \ldots$ in $V$.


Example Let $V$ be the subspace of $\mathbb{R}^{\mathbb{R}}$ consisting of all infinitely-differentiable functions $f$ satisfying the condition that there exist real numbers $a < b$ such that $f(x) = 0$ if $x \notin [a, b]$. Then the function $f \mapsto \int_{-\infty}^{\infty} f(x)\,dx$ belongs to $D(V)$. Elements of $D(V)$ are known as distributions and play an important role in analysis and theoretical physics.


Note that the linear functional $\operatorname{tr}\colon M_{n \times n}(F) \to F$ is not a homomorphism of $F$-algebras whenever $n > 1$. If $(K, \bullet)$ is an algebra over a field $F$ then a nonzero linear functional $\delta \in D(K)$ which is also a homomorphism of $F$-algebras is called a weight function on $K$ and an algebra having a weight function is called a baric algebra.¹ Nonassociative baric algebras are an important context for mathematical models in genetics.

Let $F$ be a field and let $n$ be a positive integer. Then there exists a linear functional $\operatorname{tr}\colon M_{n \times n}(F) \to F$ which assigns to each matrix the sum of the elements of its diagonal, i.e., $\operatorname{tr}\colon [a_{ij}] \mapsto \sum_{i=1}^{n} a_{ii}$. This linear functional is called the trace. This functional will play an important part in our later discussion. Note that $v \odot w = \operatorname{tr}(v \wedge w)$ for all $v, w \in F^n$.

If $A = [a_{ij}]$ and $B = [b_{ij}]$ are matrices in $M_{n \times n}(F)$, then it is easy to see that $\operatorname{tr}(AB) = \sum_{i=1}^{n}\sum_{h=1}^{n} a_{ih}b_{hi} = \operatorname{tr}(BA)$. We also notice that $\operatorname{tr}(I) = n$, where $I$ is the identity matrix of $M_{n \times n}(F)$. If the characteristic of the field $F$ does not divide $n$, we claim that these conditions uniquely characterize the trace.
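
The identities in the last two paragraphs are easy to test numerically. The Python sketch below uses small integer matrices and vectors chosen only for illustration; `dot` stands for $v \odot w$ and `outer` for $v \wedge w$, read here as the matrix $[a_ib_j]$, which is the reading under which $v \odot w = \operatorname{tr}(v \wedge w)$ holds.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

def dot(v, w):
    """The scalar v (.) w = sum of a_i * b_i."""
    return sum(a * b for a, b in zip(v, w))

def outer(v, w):
    """The matrix v ^ w = [a_i * b_j]."""
    return [[a * b for b in w] for a in v]

A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]

# AB and BA differ as matrices but have the same trace.
print(matmul(A, B) != matmul(B, A), trace(matmul(A, B)) == trace(matmul(B, A)))

# tr(I) = n.
I3 = [[1 if i == j else 0 for j in range(3)] for i in range(3)]
print(trace(I3) == 3)

# v (.) w = tr(v ^ w).
v, w = [1, 2, 3], [4, 5, 6]
print(dot(v, w) == trace(outer(v, w)))
```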




The trace of a matrix was first defined by the nineteenth-century 
American mathematician Henry Taber. 


Proposition 14.1 Let $n$ be a positive integer and let $F$ be a field the characteristic of which does not divide $n$. Let $\delta$ be a linear functional on $M_{n \times n}(F)$ satisfying the conditions that $\delta(AB) = \delta(BA)$ for all $A, B \in M_{n \times n}(F)$ and that $\delta(I) = n$. Then $\delta = \operatorname{tr}$.


¹ Such structures were first studied by the twentieth-century Scottish mathematician I.M.H. Etherington, who formulated the Mendelian laws algebraically.

Proof If $\{H_{ij} \mid 1 \leq i, j \leq n\}$ is the canonical basis of $M_{n \times n}(F)$, then it suffices to show that $\delta(H_{ij}) = \operatorname{tr}(H_{ij})$ for all $1 \leq i, j \leq n$. In particular, if $1 \leq i, j \leq n$ then $\delta(H_{ii}) = \delta(H_{ij}H_{ji}) = \delta(H_{ji}H_{ij}) = \delta(H_{jj})$. Since $I = \sum_{i=1}^{n} H_{ii}$, this implies that $n = \delta(I) = \sum_{i=1}^{n} \delta(H_{ii})$ and so $\delta(H_{ii}) = 1 = \operatorname{tr}(H_{ii})$ for all $1 \leq i \leq n$. If $i \neq j$ then $H_{ij}H_{ii} = O$ and so $\delta(H_{ij}) = \delta(H_{ii}H_{ij}) = \delta(H_{ij}H_{ii}) = \delta(O) = 0 = \operatorname{tr}(H_{ij})$, and we are done.


By the above, we see that if $F$ is a field, if $n$ is a positive integer, and if $A, B \in M_{n \times n}(F)$, then $\operatorname{tr}(A \bullet B) = \operatorname{tr}(AB) - \operatorname{tr}(BA) = 0$, where $\bullet$ denotes the Lie product on $M_{n \times n}(F)$. In fact, over fields of characteristic 0 the converse is also true. In order to establish this fact, we first need a technical result.


Proposition 14.2 Let $F$ be a field of characteristic 0 and let $n$ be a positive integer. Let $A \in M_{n \times n}(F)$ have the property that it is similar to no matrix in $M_{n \times n}(F)$ having a 0 for its $(1, 1)$-entry. Then $A$ is a scalar matrix.


Proof Clearly, $A$ is not $O$ and so there exists a vector $w \in F^n$ satisfying $A^Tw \neq \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$. Assume that there exists a vector $v \in F^n$ satisfying the condition that $w \odot v = 1$ and $(A^Tw) \odot v = 0$. Let $\delta \in D(F^n)$ be given by $\delta\colon y \mapsto w \odot y$. Then the nullity of $\delta$ is $n - 1$ and we can pick a basis $\{y_2, \ldots, y_n\}$ for $\ker(\delta)$. Since $v \notin \ker(\delta)$, we see that the set $\{v, y_2, \ldots, y_n\}$ is linearly independent. Therefore, the matrix $P$ the columns of which are $v, y_2, \ldots, y_n$ is nonsingular, and $w^T$ is the first row of $P^{-1}$. Moreover, the $(1, 1)$-entry of the matrix $P^{-1}AP$ is $(A^Tw) \odot v = 0$, contradicting the assumption on $A$. This means that there is no vector $v$ satisfying the given conditions and so $A^Tw = c_ww$ for some scalar $c_w \in F$. Thus we conclude that if $w$ is any vector in $F^n$ then $A^Tw$ is either the 0-vector or a scalar multiple of $w$ and so, for any nonsingular matrix $Q \in M_{n \times n}(F)$ we see that $Q^{-1}AQ = [b_{ij}]$ is a diagonal matrix. If $b_{hh} \neq b_{kk}$ and if $y$ is the difference between the $h$th and $k$th rows of $Q^{-1}$, then $A^Ty$ cannot be of the form $c_yy$, which is again a contradiction. Thus $A$ must in fact be a scalar matrix.


Proposition 14.3 Let $F$ be a field of characteristic 0, let $n$ be a positive integer, and let $C \in M_{n \times n}(F)$. Then $\operatorname{tr}(C) = 0$ if and only if $C$ is the Lie product of matrices $A, B \in M_{n \times n}(F)$.


Proof We have already seen that if $C$ is the Lie product of two matrices in $M_{n \times n}(F)$ then $\operatorname{tr}(C) = 0$. We will prove the converse by induction on $n$. The result is clearly true if $n = 1$ so we can assume, inductively, that $n > 1$ and that the result has been established for matrices in $M_{k \times k}(F)$ for any $k < n$. Moreover, if $C = O$, take $A = B = O$ and we are done. Hence we can assume that $C \neq O$. Since $F$ has characteristic 0 and $\operatorname{tr}(C) = 0$, we know that $C$ is not a scalar matrix. By Proposition 14.2, this means that there is a nonsingular matrix $P$ such that $P^{-1}CP$ has a 0 for its $(1, 1)$-entry. That is to say, we can write $P^{-1}CP$ in block form as
$$\begin{bmatrix} 0 & x^T \\ z & C' \end{bmatrix},$$
where $x, z \in F^{n-1}$ and $C' \in M_{(n-1) \times (n-1)}(F)$. Moreover, $\operatorname{tr}(C') = \operatorname{tr}(P^{-1}CP) = \operatorname{tr}(C) = 0$ and so, by the induction hypothesis, there exist matrices $A', B' \in M_{(n-1) \times (n-1)}(F)$ satisfying $C' = A'B' - B'A'$. If $A'$ is singular, then we can replace $A'$ by $A' - c'I$ for any scalar $c' \notin \operatorname{spec}(A')$ (and such an element $c'$ exists since $F$ has characteristic 0 and hence is infinite). Therefore, without loss of generality, we can assume that $A'$ is nonsingular. Then $P^{-1}CP = A''B'' - B''A''$, where
$$A'' = \begin{bmatrix} 0 & 0 \\ 0 & A' \end{bmatrix} \quad\text{and}\quad B'' = \begin{bmatrix} 0 & -x^TA'^{-1} \\ A'^{-1}z & B' \end{bmatrix}.$$
Thus $C = (PA''P^{-1})(PB''P^{-1}) - (PB''P^{-1})(PA''P^{-1})$.
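
The block step in this proof is concrete enough to try out. The following Python sketch implements only that step: given $x$, $z$, a nonsingular $A'$ with its inverse, and $B'$, it builds $A''$ and $B''$ and checks that their Lie product is the block matrix with 0 in the $(1, 1)$ position, $x^T$ and $z$ on the border, and $A'B' - B'A'$ in the corner. The function name `lift` and the sample data are assumptions made for this illustration, and the search for $P$ via Proposition 14.2 is not attempted.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def lie(X, Y):
    """Lie product XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[a - b for a, b in zip(r, s)] for r, s in zip(XY, YX)]

def lift(x, z, A1, A1_inv, B1):
    """Build A'' and B'' from x, z, A', (A')^{-1} and B' as in the proof."""
    m = len(A1)
    top = [-sum(x[k] * A1_inv[k][j] for k in range(m)) for j in range(m)]   # -x^T A'^{-1}
    left = [sum(A1_inv[i][k] * z[k] for k in range(m)) for i in range(m)]   # A'^{-1} z
    A2 = [[0] * (m + 1)] + [[0] + list(A1[i]) for i in range(m)]
    B2 = [[0] + top] + [[left[i]] + list(B1[i]) for i in range(m)]
    return A2, B2

# Sample data: A' is diagonal so that its inverse can be written down directly.
A1 = [[1, 0], [0, 2]]
A1_inv = [[1, 0], [0, Fraction(1, 2)]]
B1 = [[0, -4], [6, 0]]
x, z = [1, 2], [3, 5]

A2, B2 = lift(x, z, A1, A1_inv, B1)
C = lie(A2, B2)
for row in C:
    print([int(entry) for entry in row])
```

For this data the corner block $A'B' - B'A'$ has zero diagonal, so the resulting matrix C has trace 0, as the proposition requires.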


The first of many proofs of this result was given by Kenjiro Shoda, one of the major figures in twentieth-century Japanese algebra. The proof given here is due to Kahan.


Example Note that this result may be false if the field F has positive characteristic. 


For example, if F = GF(2) then tr ik : 


| = 0 but there are no matrices A and 


B in M?2x2(F) satisfying AB — BA = i a! 


Thus, if $F$ has characteristic 0 then the set of all matrices $C \in M_{n \times n}(F)$ satisfying $\operatorname{tr}(C) = 0$ forms a subalgebra of the general Lie algebra $M_{n \times n}(F)^{-}$, called the special Lie algebra defined by $F^n$.

If $n$ is a positive integer, $F$ is a field, and $P$ is a nonsingular matrix in $M_{n \times n}(F)$, then $\operatorname{tr}(PAP^{-1}) = \operatorname{tr}(P^{-1}PA) = \operatorname{tr}(A)$ and so similar matrices have identical traces. In general, if $B$ and $C$ are fixed matrices in $M_{n \times n}(F)$ then the functions $A \mapsto \operatorname{tr}(BA)$ and $A \mapsto \operatorname{tr}(AC)$ belong to $D(M_{n \times n}(F))$.

The following result shows that traces essentially define all linear functionals on 
spaces of square matrices. 


Proposition 14.4 Let $F$ be a field, let $n$ be a positive integer, and let $\delta \in D(M_{n \times n}(F))$. Then there exists a matrix $C \in M_{n \times n}(F)$ satisfying $\delta\colon A \mapsto \operatorname{tr}(AC)$ for all $A \in M_{n \times n}(F)$.

Proof For each $1 \leq i, j \leq n$, let $H_{ij}$ be the matrix having $(i, j)$-entry equal to 1 and all of the other entries equal to 0. Then we know that the set of all such matrices is a basis for $M_{n \times n}(F)$. Let $C = [c_{ij}] \in M_{n \times n}(F)$ be the matrix defined by $c_{ij} = \delta(H_{ji})$ for all $1 \leq i, j \leq n$. Then for each matrix $A = [a_{ij}] \in M_{n \times n}(F)$ we have
$$\delta(A) = \delta\left(\sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}H_{ij}\right) = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}\delta(H_{ij}) = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}c_{ji} = \operatorname{tr}(AC).$$
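
The construction of $C$ in this proof translates directly into code. In the Python sketch below, the functional `delta` is a hypothetical example chosen for illustration, and `representing_matrix` builds $C$ from the values of the functional on the matrices $H_{ji}$, exactly as in the proof.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

def unit(i, j, n):
    """The matrix H_ij: 1 in position (i, j), 0 elsewhere (zero-based)."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

def representing_matrix(delta, n):
    """The matrix C with c_ij = delta(H_ji), so that delta(A) = tr(AC)."""
    return [[delta(unit(j, i, n)) for j in range(n)] for i in range(n)]

# A hypothetical linear functional on 2 x 2 matrices.
def delta(A):
    return 2 * A[0][1] - A[1][0] + 3 * A[1][1]

C = representing_matrix(delta, 2)
A = [[1, 2], [3, 4]]
print(C, delta(A) == trace(matmul(A, C)))
```

Note the transposition of indices: the entry of $C$ in position $(i, j)$ records the value of the functional on $H_{ji}$, not on $H_{ij}$.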


Proposition 14.5 Let $F$ be a field, let $n$ be a positive integer, and let $\delta \in D(M_{n \times n}(F))$ be a linear functional satisfying $\delta(AB) = \delta(BA)$ for all $A, B \in M_{n \times n}(F)$. Then there exists a scalar $c \in F$ such that $\delta(A) = c \cdot \operatorname{tr}(A)$ for all $A \in M_{n \times n}(F)$.


Proof Again, for each $1 \leq i, j \leq n$, let $H_{ij}$ be the matrix having $(i, j)$-entry equal to 1 and all of the other entries equal to 0. If $1 \leq i \neq j \leq n$ then $\delta(H_{ij}) = \delta(H_{ii}H_{ij}) = \delta(H_{ij}H_{ii}) = \delta(O) = 0$. Moreover, for all $1 \leq j, k \leq n$ we have $\delta(H_{jj}) = \delta(H_{jk}H_{kj}) = \delta(H_{kj}H_{jk}) = \delta(H_{kk})$. Thus we see that there exists a $c \in F$ such that $\delta(H_{jj}) = c$ for all $1 \leq j \leq n$ and from Proposition 14.4 we conclude that $\delta(A) = \operatorname{tr}(A \cdot cI) = c \cdot \operatorname{tr}(A)$ for all $A \in M_{n \times n}(F)$.


Proposition 14.6 (Taber's Theorem) Let $F$ be a field, let $n$ be a positive integer, and let $A \in M_{n \times n}(F)$ be a matrix the characteristic polynomial of which is completely reducible. Then $\operatorname{tr}(A)$ is the sum of the eigenvalues of $A$ (with the appropriate multiplicities).


Proof Let $p(X) = \sum_{i=0}^{n} c_iX^i$ be the characteristic polynomial of $A$. We know that this polynomial is completely reducible, say $p(X) = \prod_{i=1}^{n} (X - b_i)$, and after multiplying this out, we see that $c_{n-1} = -\sum_{i=1}^{n} b_i$. But from the definition of the characteristic polynomial, we also see that $c_{n-1} = -\operatorname{tr}(A)$. Thus we see that, for any such matrix, $\operatorname{tr}(A)$ is the sum of the eigenvalues of $A$ (with the appropriate multiplicities).


Example Let $F$ be a field, let $\Omega$ be a nonempty set, and let $V = F^{\Omega}$. For each $a \in \Omega$ there exists a linear functional $\delta_a \in D(V)$ defined by evaluation: $\delta_a\colon f \mapsto f(a)$. In the case that $F$ is $\mathbb{R}$ and $\Omega$ is the unit interval of the real line, this functional is known to physicists as the Dirac functional. In analysis, evaluation functionals are often used to establish boundary conditions on classes of functions being studied.

Paul Dirac, the Nobel-prize-winning twentieth-century British physicist, built the first accepted model of quantum mechanics, in which linear functionals played a fundamental part.


Example Let $n$ be a positive integer and let $c$ and $d$ be complex numbers. Can we
find all matrices $A \in M_{n \times n}(\mathbb{C})$ having the property that $c$ is an eigenvalue of $A$
having geometric multiplicity $n - 1$ and $d$ is an eigenvalue of $A$ having geometric
multiplicity 1? (Certainly one such matrix always exists, namely a diagonal matrix
with $c$ appearing $n - 1$ times on the diagonal and $d$ once.) In general, in order for $c$
to be an eigenvalue of $A$ of geometric multiplicity $n - 1$, the eigenspace associated
with it has to be of dimension $n - 1$. In other words, the nullity of the matrix $A - cI$
must equal $n - 1$. From this we see that the dimension of the column space of $A - cI$
must equal 1, and so there must exist nonzero vectors
$u = \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix}$ and $v = \begin{bmatrix} e_1 \\ \vdots \\ e_n \end{bmatrix}$ in
$\mathbb{C}^n$ such that $A - cI = u \wedge v$, whence $A = u \wedge v + cI$. Conversely, if $A$ is a matrix of
the form $u \wedge v + cI$, then $c$ is an eigenvalue of $A$ having multiplicity at least $n - 1$.
Note that $\operatorname{tr}(A) = \sum_{i=1}^{n} b_i e_i + nc$. But, as we just noted, $\operatorname{tr}(A)$ is also the sum of the
eigenvalues of $A$, counted by multiplicity, and so we want it to equal $d + (n - 1)c$.
Thus we are reduced to finding vectors $u$ and $v$ as above satisfying the condition
that $\sum_{i=1}^{n} b_i e_i = d - c$. This is easy to do in concrete cases.
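
One concrete case can be worked out as follows. This is a sketch only: it assumes that
$u \wedge v$ denotes the $n \times n$ matrix whose $(i, j)$-entry is $b_i e_j$ (which is consistent with
the trace formula above), and the values $n = 3$, $c = 2$, $d = 7$ and the vectors are
hypothetical choices satisfying $\sum b_i e_i = d - c$.

```python
from fractions import Fraction

def build_matrix(u, v, c):
    """Return A = u v^T + cI, where u v^T has (i, j)-entry u[i] * v[j]."""
    n = len(u)
    return [[u[i] * v[j] + (c if i == j else 0) for j in range(n)]
            for i in range(n)]

def mat_vec(A, x):
    """Product of the matrix A with the column vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

n, c, d = 3, Fraction(2), Fraction(7)
u = [Fraction(1), Fraction(1), Fraction(1)]
v = [Fraction(1), Fraction(2), Fraction(2)]   # sum of u[i] * v[i] is 5 = d - c
A = build_matrix(u, v, c)

# Two linearly independent vectors x with v . x = 0; these span the eigenspace of c.
kernel_vectors = [[Fraction(2), Fraction(-1), Fraction(0)],
                  [Fraction(0), Fraction(1), Fraction(-1)]]
```

In this instance `mat_vec(A, u)` equals $d\,u$, each vector in `kernel_vectors` is sent to
$c$ times itself, and the trace of `A` is $d + (n - 1)c$.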


The following proposition shows that there are always enough linear functionals to
enable us to distinguish between vectors.


Proposition 14.7 Let $V$ be a vector space over a field $F$. If $v \neq w$ are elements
of $V$ then there exists a linear functional $\delta \in D(V)$ satisfying $\delta(v) \neq \delta(w)$.


Proof Since the set $\{v - w\}$ is linearly independent, it can be completed to a basis
$B$ of $V$. By Proposition 6.2, there exists a linear functional $\delta \in D(V)$ satisfying
$\delta(v - w) = 1$ and $\delta(u) = 0$ for all $u \in B \setminus \{v - w\}$. This is the linear functional we
want, since $\delta(v) - \delta(w) = \delta(v - w) = 1 \neq 0$.


In particular, if $0_V \neq v \in V$ then there is a linear functional $\delta \in D(V)$ satisfying
$\delta(v) \neq 0$.

Proposition 14.8 Let $V$ be a vector space over a field $F$ having a basis $B$. Then
$D(V) \cong F^{B}$. In particular, if $V$ is finitely generated over $F$ then $D(V) \cong V$.


Proof We have a function $\alpha : D(V) \to F^{B}$ given by restriction, and it is
straightforward to check that this function is a linear transformation. Since every
element of $D(V)$ is totally defined by its action on a basis, this function is monic and,
by Proposition 6.2, it is epic. Therefore, it is an isomorphism. If $B$ is finite, then
$V \cong F^{(B)} = F^{B}$, and so $D(V) \cong V$.


In particular, we see the important relationship between $F^{(\Omega)}$ and $F^{\Omega}$ for any
nonempty set $\Omega$, namely that $F^{\Omega}$ is isomorphic to the dual space of $F^{(\Omega)}$. Note too
that $D(V) \not\cong V$ whenever the vector space $V$ is not finitely generated over $F$ since
it can be shown, using the arithmetic of transfinite cardinals, that $F^{(\Omega)}$ and $F^{\Omega}$ are
never isomorphic when $\Omega$ is infinite.

Let us consider the idea inherent in Proposition 14.8. Let $V$ be a vector space
over a field $F$ and let $B$ be a given basis for $V$. For each $v \in B$, let $\delta_v \in D(V)$
satisfy $\delta_v(v) = 1$ and $\delta_v(u) = 0$ for all $v \neq u \in B$. We claim that
$E = \{\delta_v \mid v \in B\}$ is a linearly-independent subset of $D(V)$. Indeed, suppose that
$c_1, \dots, c_n$ are scalars in $F$ and $u_1, \dots, u_n$ are distinct elements of $B$ satisfying the
condition that $\sum_{i=1}^{n} c_i \delta_{u_i}$ is the 0-functional. Then for all $1 \le h \le n$ we have
$0 = \left(\sum_{i=1}^{n} c_i \delta_{u_i}\right)(u_h) = \sum_{i=1}^{n} c_i \delta_{u_i}(u_h) = c_h$. This establishes the claim.
If $V$ is finitely generated then $B$ is finite and so $E$ is a basis for $D(V)$, since it is
easy to check that $\delta = \sum_{u \in B} \delta(u)\delta_u$ for all $\delta \in D(V)$. Such a basis $E$ for $D(V)$ is
called the dual basis of the basis $B$ for $V$. If $V$ is not finitely generated, then $FE$ is
a subspace of $D(V)$ composed of all those linear functionals $\delta \in D(V)$ satisfying the
condition that $\delta(u) \neq 0$ for at most finitely-many elements $u$ of $B$. This subspace is
called the weak dual space of $V$.
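
For $V = \mathbb{Q}^n$ the dual basis can be computed explicitly. The sketch below assumes that
each functional is represented by a row of coefficients, so that $\delta_i$ is the $i$th row of
the inverse of the matrix whose columns are the basis vectors. The helper names and
the sample basis are hypothetical, and the inversion routine is a plain Gauss-Jordan
elimination that raises `StopIteration` if the given vectors are not a basis.

```python
from fractions import Fraction

def inverse(P):
    """Invert a square matrix over Q by Gauss-Jordan elimination."""
    n = len(P)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(P)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def dual_basis(basis):
    """Row i of the result holds the coefficients of the functional delta_i."""
    n = len(basis)
    P = [[basis[j][i] for j in range(n)] for i in range(n)]  # columns are basis vectors
    return inverse(P)

def apply(functional, vector):
    """Evaluate a functional, given by its coefficient row, at a vector."""
    return sum(a * b for a, b in zip(functional, vector))

basis = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]   # illustrative basis of Q^3
dual = dual_basis(basis)
```

Evaluating `apply(dual[i], basis[j])` gives 1 when `i == j` and 0 otherwise, which is
the defining property of the dual basis.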


Example Let $V$ be the vector space of all polynomial functions in $\mathbb{R}^{\mathbb{R}}$ having
degree at most 4. Suppose that $a_1, \dots, a_5$ are distinct positive real numbers
and, for each $1 \le i \le 5$, let $\delta_i \in D(V)$ be the linear functional defined
by $\delta_i : p(t) \mapsto \int_0^{a_i} p(t)\,dt$. We claim that $B = \{\delta_1, \dots, \delta_5\}$ is a basis for $D(V)$.
Indeed, since we know by Proposition 14.8 that $\dim(D(V)) = 5$, all we have to
show is that the set $B$ is linearly independent. That is to say, we must show that if
there exist real numbers $b_1, \dots, b_5$ satisfying the condition that $\sum_{i=1}^{5} b_i \delta_i$ is the
0-functional, then $b_i = 0$ for all $1 \le i \le 5$. Since
$\sum_{i=1}^{5} b_i \delta_i(t^h) = \sum_{i=1}^{5} \frac{1}{h+1} a_i^{h+1} b_i$ for all $0 \le h \le 4$, we must show that the matrix

$$\begin{bmatrix}
a_1 & \frac{1}{2}a_1^2 & \frac{1}{3}a_1^3 & \frac{1}{4}a_1^4 & \frac{1}{5}a_1^5 \\
a_2 & \frac{1}{2}a_2^2 & \frac{1}{3}a_2^3 & \frac{1}{4}a_2^4 & \frac{1}{5}a_2^5 \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
a_5 & \frac{1}{2}a_5^2 & \frac{1}{3}a_5^3 & \frac{1}{4}a_5^4 & \frac{1}{5}a_5^5
\end{bmatrix}$$

is nonsingular, which is the case since its determinant is a nonzero scalar multiple of
the determinant of a Vandermonde matrix.

Example Let $a < b$ be real numbers, let $t_1, \dots, t_n$ be distinct real numbers in
the closed interval $[a, b]$ of the real line, and let $W$ be the subspace of $\mathbb{R}^{[a,b]}$
consisting of all polynomial functions of degree less than $n$. Then $\dim(W) = n$. For
each $1 \le i \le n$, let $\delta_i \in D(W)$ be the linear functional defined by $\delta_i : p \mapsto p(t_i)$.
We claim that the subset $B = \{\delta_1, \dots, \delta_n\}$ of $D(W)$ is linearly independent. Indeed,
if $\sum_{i=1}^{n} c_i \delta_i$ is the 0-functional and if $1 \le h \le n$, then
$0 = \left(\sum_{i=1}^{n} c_i \delta_i\right) \prod_{j \neq h} (X - t_j) = c_h \prod_{j \neq h} (t_h - t_j)$, which implies that $c_h = 0$
since the $t_j$ are distinct. Therefore, by Proposition 14.8, $B$ is a basis for $D(W)$. Since
the function $p \mapsto \int_a^b p(x)\,dx$ also belongs to $D(W)$, we conclude that there exist real
numbers $c_1, \dots, c_n$ satisfying

$$\int_a^b p(x)\,dx = \sum_{i=1}^{n} c_i\, p(t_i) \quad \text{for any } p \in W.$$
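
The scalars $c_1, \dots, c_n$ in the last example can be found by solving a linear system:
applying both sides to $p(x) = x^h$ for $0 \le h < n$ gives
$\sum_{i=1}^{n} c_i t_i^h = (b^{h+1} - a^{h+1})/(h+1)$. The sketch below does this over $\mathbb{Q}$; the
function names are hypothetical, and the interval $[0, 1]$ with nodes $0, \frac{1}{2}, 1$ is an
illustrative choice, not one taken from the text.

```python
from fractions import Fraction

def solve(M, rhs):
    """Solve the square system M x = rhs over Q by Gauss-Jordan elimination."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def quadrature_weights(nodes, a, b):
    """Weights c_i with integral_a^b p = sum c_i p(t_i) for deg p < len(nodes)."""
    n = len(nodes)
    nodes = [Fraction(t) for t in nodes]
    a, b = Fraction(a), Fraction(b)
    M = [[t ** h for t in nodes] for h in range(n)]                  # row h: t_i^h
    rhs = [(b ** (h + 1) - a ** (h + 1)) / (h + 1) for h in range(n)]  # integral of x^h
    return solve(M, rhs)

nodes = [Fraction(0), Fraction(1, 2), Fraction(1)]
weights = quadrature_weights(nodes, 0, 1)

def p(x):
    """An arbitrary test polynomial of degree less than 3."""
    return 3 * x * x - x + 2
```

For these nodes the weights come out as $\frac{1}{6}, \frac{2}{3}, \frac{1}{6}$, and
`sum(w * p(t) for w, t in zip(weights, nodes))` agrees with $\int_0^1 (3x^2 - x + 2)\,dx = \frac{5}{2}$.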


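Let $V$ and $W$ be vector spaces over a field $F$ and let $\alpha \in \operatorname{Hom}(V, W)$. If
$\delta \in D(W)$ then $\delta\alpha \in D(V)$. Moreover, if $\delta_1, \delta_2 \in D(W)$ and if $v \in V$ then
$[(\delta_1 + \delta_2)\alpha](v) = (\delta_1 + \delta_2)\alpha(v) = \delta_1\alpha(v) + \delta_2\alpha(v) = [\delta_1\alpha + \delta_2\alpha](v)$ and so
$(\delta_1 + \delta_2)\alpha = \delta_1\alpha + \delta_2\alpha$. Similarly, if $c \in F$ and if $\delta \in D(W)$ then $c(\delta\alpha) = (c\delta)\alpha$.
Therefore, we see that $\alpha$ defines a linear transformation $D(\alpha) \in \operatorname{Hom}(D(W), D(V))$
by setting $D(\alpha) : \delta \mapsto \delta\alpha$. If $V$, $W$, and $Y$ are vector spaces over $F$ and if
$\alpha \in \operatorname{Hom}(V, W)$ and $\beta \in \operatorname{Hom}(W, Y)$ then it is straightforward to show that
$D(\beta\alpha) = D(\alpha)D(\beta)$. If $\alpha$ is an isomorphism, then $D(\alpha)$ is also an isomorphism,
where $D(\alpha)^{-1} = D(\alpha^{-1})$.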


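Proposition 14.9 Let $F$ be a field and let $V$ and $W$ be vector spaces finitely
generated over $F$. Let $B = \{v_1, \dots, v_k\}$ be a basis for $V$, the dual basis of
which is $C = \{\delta_1, \dots, \delta_k\}$, and let $D = \{w_1, \dots, w_n\}$ be a basis for $W$, the
dual basis of which is $E = \{\eta_1, \dots, \eta_n\}$. If $\alpha : V \to W$ is a linear
transformation then $\Phi_{EC}(D(\alpha)) = \Phi_{BD}(\alpha)^T$.


Proof Let $\Phi_{BD}(\alpha) = [a_{ij}]$. For each $1 \le i \le k$ we have $\alpha(v_i) = \sum_{h=1}^{n} a_{hi} w_h$
and so for all $1 \le i \le k$ and all $1 \le j \le n$ we have
$[D(\alpha)(\eta_j)](v_i) = \eta_j\alpha(v_i) = \sum_{h=1}^{n} a_{hi}\eta_j(w_h) = a_{ji}$. But each $\delta \in D(V)$ satisfies
$\delta = \sum_{i=1}^{k} \delta(v_i)\delta_i$ and so, in particular,
$D(\alpha)(\eta_j) = \sum_{i=1}^{k} [D(\alpha)(\eta_j)](v_i)\,\delta_i = \sum_{i=1}^{k} a_{ji}\delta_i$, which gives the
desired result.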


We have already seen that, given a vector space $V$ over a field $F$, we can build
the dual space $D(V)$. Since this too is a vector space over $F$, we can go on to
build its dual space, $D^2(V) = D(D(V))$. What do some elements of this space look
like? Each $v \in V$ defines a function $\theta_v : D(V) \to F$ by setting $\theta_v : \delta \mapsto \delta(v)$. This
is indeed a linear functional and so is an element of $D^2(V)$, which we call the
evaluation functional at $v$.

Proposition 14.10 Let $V$ be a vector space over a field $F$. The function
$v \mapsto \theta_v$ is a monomorphism from $V$ to $D^2(V)$, which is an isomorphism in
the case $V$ is finitely generated.


Proof We first have to show that this function is a linear transformation. And,
indeed, if $v, w \in V$, if $a \in F$, and if $\delta \in D(V)$, then as a direct consequence of
the definitions we obtain
$\theta_{v+w}(\delta) = \delta(v + w) = \delta(v) + \delta(w) = \theta_v(\delta) + \theta_w(\delta) = [\theta_v + \theta_w](\delta)$
and so $\theta_{v+w} = \theta_v + \theta_w$. Similarly, $\theta_{av}(\delta) = \delta(av) = a\delta(v) = a\theta_v(\delta)$
and so $\theta_{av} = a\theta_v$. Thus we have shown that we do indeed have a homomorphism.
If $v$ belongs to the kernel of this function then $\theta_v(\delta) = \delta(v) = 0$ for all $\delta \in D(V)$ and
so, by Proposition 14.7, we know that $v = 0_V$. Thus it is a monomorphism. Finally,
if $V$ is finitely generated then, by Proposition 14.8, we see that
$\dim(D^2(V)) = \dim(D(V)) = \dim(V)$ and so any monomorphism from $V$ to $D^2(V)$ has to be an
isomorphism.


We should note that the importance of Proposition 14.10 lies not in the existence 
of an isomorphism between V and D?(V), which could be inferred from dimension 
arguments alone, but in finding a specific, natural, such isomorphism. 

A proper subspace W of a vector space V over a field F is a maximal subspace if 
and only if there is no subspace of V properly contained in V and properly contain- 
ing W. By the Hausdorff Maximum Principle, we know that any nontrivial vector 
space contains a maximal subspace. The maximal subspaces of finitely-generated 
vector spaces are usually called hyperplanes of the space. We will now use linear 
functionals in order to characterize these subspaces of V. 


Proposition 14.11 A subspace $W$ of a vector space $V$ over a field $F$ is
maximal if and only if there exists a linear functional $\delta \in D(V)$ which is not the
0-functional, with kernel $W$.


Proof Let us assume that $W = \ker(\delta)$, where $\delta$ is a linear functional which is not the
0-functional, so that $W$ is a proper subspace of $V$. Assume that $Y$ is a subspace of $V$
which properly contains $W$, and pick $y \in Y \setminus W$. Then $\delta(y) \neq 0$. If $v \in V$ then
$v - \delta(v)\delta(y)^{-1}y \in \ker(\delta) = W \subseteq Y$, and so $v \in Y$. Thus $Y = V$, which shows that
$W$ must be a maximal subspace of $V$. Conversely, let $W$ be a maximal
subspace of $V$ and let $y \in V \setminus W$. Then $Fy \cap W = \{0_V\}$ and $Fy + W = V$ by the
maximality of $W$. Therefore, $V = Fy \oplus W$ and so every vector in $V$ can be written
in the form $ay + w$, where $a \in F$ and $w \in W$. The function $\delta : ay + w \mapsto a$ is a
linear functional in $D(V)$ the kernel of which equals $W$.

Proposition 14.12 Let $V$ be a vector space over a field $F$ and let $\delta, \delta_1, \dots, \delta_n$
be elements of $D(V)$. Then $\delta \in F\{\delta_1, \dots, \delta_n\}$ if and only if
$\bigcap_{i=1}^{n} \ker(\delta_i) \subseteq \ker(\delta)$.


Proof Assume that $\delta \in F\{\delta_1, \dots, \delta_n\}$. Then there exist scalars $a_1, \dots, a_n$ such that
$\delta = \sum_{i=1}^{n} a_i\delta_i$. If $v \in \bigcap_{i=1}^{n} \ker(\delta_i)$ then $\delta_i(v) = 0$ for all $1 \le i \le n$ and so
$\delta(v) = \sum_{i=1}^{n} a_i\delta_i(v) = 0$. Thus $v \in \ker(\delta)$. Conversely, suppose that
$\bigcap_{i=1}^{n} \ker(\delta_i) \subseteq \ker(\delta)$. We will proceed by induction on $n$. First, assume that $n = 1$.
If $\delta$ is the 0-functional, then surely we are done. Thus let us assume that this is not
the case and let $v \in V \setminus \ker(\delta)$. Since $\ker(\delta_1) \subseteq \ker(\delta)$, this means that $\delta_1(v) \neq 0$.
Set $a = \delta_1(v)^{-1}\delta(v)$. Then $\delta(v) = a\delta_1(v) = (a\delta_1)(v)$ and so $v \in \ker(\delta - a\delta_1)$. But
$\ker(\delta_1) \subseteq \ker(\delta - a\delta_1)$, and this containment is proper since $v \notin \ker(\delta_1)$. By
Proposition 14.11, $\ker(\delta_1)$ is a maximal subspace of $V$ and so $\ker(\delta - a\delta_1) = V$, which
shows that $\delta = a\delta_1$.

Now let us assume that we have proved the result for a given $n$ and assume we
have linear functionals $\delta, \delta_1, \dots, \delta_{n+1}$ in $D(V)$ satisfying
$\bigcap_{i=1}^{n+1} \ker(\delta_i) \subseteq \ker(\delta)$. Set $W = \ker(\delta_{n+1})$ and for each $1 \le i \le n$ let $\beta_i$ be the
restriction of $\delta_i$ to $W$. Also, let $\beta$ be the restriction of $\delta$ to $W$. Then
$\bigcap_{i=1}^{n} \ker(\beta_i) \subseteq \ker(\beta)$ and so, by the induction hypothesis, we know that there exist
scalars $a_1, \dots, a_n$ such that $\beta = \sum_{i=1}^{n} a_i\beta_i$. Therefore,
$\ker(\delta_{n+1}) \subseteq \ker\left(\delta - \sum_{i=1}^{n} a_i\delta_i\right)$ and, as in the case $n = 1$, it follows that there
exists a scalar $a_{n+1}$ such that $\delta - \sum_{i=1}^{n} a_i\delta_i = a_{n+1}\delta_{n+1}$, proving that
$\delta = \sum_{i=1}^{n+1} a_i\delta_i$.


In the context of functional analysis, the following consequence of Propo- 
sition 14.11, taken together with the Riesz Representation Theorem (Proposi- 
tion 16.14), is known as the Fredholm alternative, and has many important applica- 
tions. 


The Swedish mathematician Ivar Fredholm was active in the late 
nineteenth century and studied the solvability of integral equations. 


Proposition 14.13 Let $V$ and $W$ be vector spaces over a field $F$, let
$\alpha \in \operatorname{Hom}(V, W)$, and let $w \in W$. Then $w \in \operatorname{im}(\alpha)$ if and only if $w \in \ker(\delta)$
for any $\delta \in D(W)$ satisfying $\operatorname{im}(\alpha) \subseteq \ker(\delta)$.


Proof If $w \in \operatorname{im}(\alpha)$ then the given condition clearly holds. Conversely, assume that
$w \notin \operatorname{im}(\alpha)$ and let $B$ be a basis for $\operatorname{im}(\alpha)$. By Proposition 5.3, the set $\{w\} \cup B$
is linearly independent, and so there exists a subset $B'$ of $W$ containing $B$ such
that $\{w\} \cup B'$ is a basis for $W$. Then $FB'$ is a maximal subspace of $W$ and so, by
Proposition 14.11, there exists a $\delta \in D(W)$ satisfying $\delta(w) \neq 0$ and
$\operatorname{im}(\alpha) \subseteq FB' = \ker(\delta)$.


Exercises 


Exercise 882 

Let $V = C(0, 1)$. From calculus we know that for each $f \in V$ there exists a
maximal element $a_f$ of $\{f(t) \mid 0 \le t \le 1\}$. Is the function $f \mapsto a_f$ a linear functional
on $V$?


Exercise 883 

Let $W$ be a subspace of $\mathbb{Q}[X]$ generated by a countably-infinite
linearly-independent set $\{p_1(X), p_2(X), \dots\}$ of polynomials. Let $\delta : W \to \mathbb{Q}$ be the
function defined by $\delta : \sum_{i=1}^{\infty} a_i p_i(X) \mapsto \sum_{i=1}^{\infty} a_i \deg(p_i)$ (where only finitely-many
of the $a_i$ are nonzero). Does $\delta$ belong to $D(W)$?


Exercise 884 
Let $F = GF(2)$ and let $\delta : F^3 \to F$ be the function which assigns to each vector
$v = \begin{bmatrix} a \\ b \\ c \end{bmatrix}$ the value (0 or 1) appearing in the majority of entries of $v$. Is $\delta$ a
linear functional?


Exercise 885 
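Find a linear functional $\delta \in D(\mathbb{R}^3)$ which is not the 0-functional but which
satisfies $\delta\left(\begin{bmatrix} 3 \\ 2 \\ -1 \end{bmatrix}\right) = \delta\left(\begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix}\right) = 0$.


Exercise 886

Let $V = \mathbb{Q}[X]$ and to each vector $v = [b_1, b_2, \dots] \in \mathbb{Q}^{\infty}$ assign a linear
functional $\delta_v \in D(V)$ defined by $\delta_v : \sum_{n=0}^{k} a_n X^n \mapsto \sum_{n=0}^{k} n!\,a_n b_{n+1}$. Is the function
$\alpha : \mathbb{Q}^{\infty} \to D(V)$ defined by $v \mapsto \delta_v$ an isomorphism?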


Exercise 887 
Let $V$ be a vector space over a field $F$ and let $\alpha, \beta \in D(V)$ satisfy the condition
that $\ker(\beta) \subseteq \ker(\alpha)$. Show that $\alpha \in F\beta$.


Exercise 888
Let $F$ be a field and let $0 \neq a \in F$. Let $\alpha : F[X] \to F$ be the function defined by
$\alpha : p(X) \mapsto p(a) - p(0)$. Is $\alpha$ a linear functional?


Exercise 889

Let $F$ be a field of characteristic 0 and let $n$ be a positive integer. Show that
any matrix $A \in M_{n \times n}(F)$ is similar to a matrix all diagonal entries of which are
equal to 0.


Exercise 890 
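Let $V = \mathbb{R}^3$ and consider the linear functionals

$$\delta_1 : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto 2a - b + 3c, \quad
\delta_2 : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto 3a - 5b + c, \quad \text{and} \quad
\delta_3 : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto 4a - 7b + c$$

on $V$. Is $\{\delta_1, \delta_2, \delta_3\}$ a basis for $D(V)$?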


Exercise 891 

Let V be a vector space finitely generated over a field F and let W be a subspace 
of $V$ having a complement $Y$ in $V$. Show that $D(V) = W' \oplus Y'$, where $W'$ is a
subspace of D(V) isomorphic to W and Y’ is a subspace of D(V) isomorphic 
to Y. 


Exercise 892 

Let $n$ be a positive integer and let $V$ be the vector space of all polynomial
functions from $\mathbb{R}$ to itself of degree no more than $n$. For all $0 \le k \le n$, let $\delta_k : V \to \mathbb{R}$
be the function defined by $\delta_k : p \mapsto \int_0^1 t^k p(t)\,dt$. Show that $\{\delta_0, \dots, \delta_n\}$ is a
basis of $D(V)$.


Exercise 893 
0 0 1 

Let B= 31,] 11],] —1 |$ CR. Find the dual basis of B. 
= = 3 


Exercise 894 

Let $n$ be a positive integer and let $V$ be a vector space of dimension $n$ over a
field $F$. Let $B = \{\delta_1, \dots, \delta_n\}$ be a subset of $D(V)$ and assume that there exists a
vector $0_V \neq v \in V$ satisfying $\delta_i(v) = 0$ for all $1 \le i \le n$. Show that $B$ is linearly
dependent.


Exercise 895 

Let $V$ be a vector space over a field $F$. For every subspace $W$ of $V$, let
$E(W) = \{\delta \in D(V) \mid \ker(\delta) \supseteq W\}$. Show that $E(W)$ is a subspace of $D(V)$.
Moreover, if $\{W_i \mid i \in \Omega\}$ is a family of subspaces of $V$, show that
$E\left(\sum_{i \in \Omega} W_i\right) = \bigcap_{i \in \Omega} E(W_i)$.

Exercise 896

Let $V$ be a vector space finitely generated over a field $F$ and let $W$ be a subspace
of $V$. For $E(W) = \{\delta \in D(V) \mid \ker(\delta) \supseteq W\}$, show that
$\dim(W) + \dim(E(W)) = \dim(V)$.


Exercise 897 

Let $V$ be a vector space finitely generated over a field $F$, let $W$ be a subspace
of $V$, and let $Y$ be a subspace of $D(V)$. Are the following conditions equivalent:
(1) $Y = \{\delta \in D(V) \mid \ker(\delta) \supseteq W\}$;
(2) $W = \bigcap_{\delta \in Y} \ker(\delta)$?


Exercise 898 
Let $A, B \in M_{2 \times 2}(\mathbb{R})$. Show that $\operatorname{tr}(AB) = \operatorname{tr}(A) \cdot \operatorname{tr}(B)$ if and only if
$|A + B| = |A| + |B|$.


Exercise 899 

Let $n$ be a positive integer and let $U$ be a finite subset of $M_{n \times n}(\mathbb{C})$ which is
closed under multiplication of matrices. Show that there exists a matrix $A$ in $U$
satisfying $\operatorname{tr}(A) \in \{1, \dots, n\}$.


Exercise 900 

Let $n$ be a positive integer and let $F$ be a field. For any matrix
$A = [a_{ij}] \in M_{n \times n}(F)$, define the antitrace of $A$ to be
$\operatorname{antitr}(A) = \sum_{i=1}^{n} a_{i,n+1-i}$. Is the function $A \mapsto \operatorname{antitr}(A)$ a linear functional on
$M_{n \times n}(F)$?


Exercise 901 
Let $F$ be a field and let $A \in M_{2 \times 2}(F)$ be a matrix satisfying $\operatorname{tr}(A) = \operatorname{tr}(A^2) = 0$.
Is it necessarily true that $A = O$?


Exercise 902
Let $k$ and $n$ be positive integers. If $O \neq A \in M_{k \times n}(\mathbb{R})$, does there necessarily
exist a matrix $B \in M_{n \times k}(\mathbb{R})$ satisfying $\operatorname{tr}(AB) \neq 0$?


Exercise 903
Let $F$ be a field and let $k \neq n$ be positive integers. Let $A \in M_{k \times n}(F)$ and
$B \in M_{n \times k}(F)$. Are $\operatorname{tr}(AB)$ and $\operatorname{tr}(BA)$ necessarily equal?


Exercise 904 


1 2-i 1+i 1 l+i 2-i 
Show that the matrices | 4+i 1+i 0 and | 3-71 1+i 0 in 
1+i 1 1 1 27 1—i 


M3x3(C) are not similar. 


Exercise 905 
Let $n$ be a positive integer and let $V$ be the subspace of $\mathbb{R}[X]$ composed of all
polynomials of degree at most $n$. What is the dual basis of $\{1, X, \dots, X^n\}$?

Exercise 906
Let $n$ be a positive integer. If $B$ and $C$ are elements of $M_{n \times n}(\mathbb{R})$ satisfying
$\operatorname{tr}(B) \le \operatorname{tr}(C)$, and if $A \in M_{n \times n}(\mathbb{R})$, is it necessarily true that $\operatorname{tr}(AB) \le \operatorname{tr}(AC)$?


Exercise 907 
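For a matrix $A \in M_{3 \times 3}(\mathbb{R})$, find a positive integer $c$ satisfying

$$|A| = \frac{1}{c} \begin{vmatrix}
\operatorname{tr}(A) & 1 & 0 \\
\operatorname{tr}(A^2) & \operatorname{tr}(A) & 2 \\
\operatorname{tr}(A^3) & \operatorname{tr}(A^2) & \operatorname{tr}(A)
\end{vmatrix}.$$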


Exercise 908 

Let $k$ and $n$ be positive integers and let $F$ be a field. Define a function
$\alpha : M_{kn \times kn}(F) \to M_{n \times n}(F)$ as follows: if $A \in M_{kn \times kn}(F)$, write $A = [A_{ij}]$,
where each $A_{ij}$ is a $(k \times k)$-block. Then set $\alpha(A) = [b_{ij}] \in M_{n \times n}(F)$, where
$b_{ij} = \operatorname{tr}(A_{ij})$ for each $1 \le i, j \le n$. Is $\alpha$ a linear transformation? Is it a
homomorphism of unital $F$-algebras?


Exercise 909 

Let $A$ be a nonempty set and let $V$ be the collection of all subsets of $A$, which is
a vector space over $GF(2)$. Is the characteristic function of $\emptyset \neq D \subseteq A$ a linear
functional on $V$?


Exercise 910
For each integer $n > 1$, find a nonsingular matrix $A \in M_{n \times n}(\mathbb{Q})$ satisfying
$\operatorname{tr}(A) = 0$.


Exercise 911
Let $n > 1$ be an integer and let $A \in M_{n \times n}(\mathbb{R})$. Does there necessarily exist a
symmetric matrix $B \in M_{n \times n}(\mathbb{R})$ satisfying $\operatorname{tr}(A) = \operatorname{tr}(B)$?


Exercise 912

Let $V$ be a vector space finitely generated over $\mathbb{Q}$ and let $\alpha \in \operatorname{End}(V)$ be a
projection. Show that there is a basis $D$ of $V$ satisfying the condition that the rank
of $\alpha$ equals $\operatorname{tr}(\Phi_{DD}(\alpha))$.


Exercise 913 

Let $W$ be a proper subspace of a vector space $V$ over a field $F$ and let $v \in V \setminus W$.
Show that there is a linear functional $\delta \in D(V)$ satisfying $\delta(v) \neq 0$ but $\delta(w) = 0$
for all $w \in W$.


Exercise 914

Let $V$ be a vector space finitely generated over a field $F$ and let $W_1$ and $W_2$ be
proper subspaces of $V$ satisfying $V = W_1 \oplus W_2$. Show that $D(V) = E_1 \oplus E_2$,
where $E_j = \{\delta \in D(V) \mid W_j \subseteq \ker(\delta)\}$ for $j = 1, 2$.

Exercise 915
Let $F$ be a field and let $A \in M_{2 \times 2}(F)$. Show that we always have
$A^2 - \operatorname{tr}(A)A + |A|I = O$.


Exercise 916 

Let $V$ be the subspace of $\mathbb{R}^{\infty}$ consisting of all sequences $[a_1, a_2, \dots] \in \mathbb{R}^{\infty}$
satisfying the condition that $\lim_{i \to \infty} a_i$ exists in $\mathbb{R}$. Define linear functionals
$\delta_1, \delta_2, \dots, \delta_{\infty} \in D(V)$ by setting $\delta_h : [a_1, a_2, \dots] \mapsto a_h$ for each $h = 1, 2, \dots$
and $\delta_{\infty} : [a_1, a_2, \dots] \mapsto \lim_{i \to \infty} a_i$. Is the subset $\{\delta_1, \delta_2, \dots, \delta_{\infty}\}$ of $D(V)$
necessarily linearly independent?


Exercise 917

Let $F$ be a field and, for each $a \in F$, let $\varepsilon_a : F[X] \to F$ be the linear functional
defined by $\varepsilon_a : p(X) \mapsto p(a)$. Show that the subset $\{\varepsilon_a \mid a \in F\}$ of $D(F[X])$ is
linearly independent.


Exercise 918

Let $V$ be a vector space over a field $F$ and let $\delta_1, \delta_2 \in D(V)$ be linear functionals
satisfying the condition that $\delta_1(v)\delta_2(v) = 0$ for all $v \in V$. Show that one of the
$\delta_i$ must be the 0-functional.


Exercise 919

Let $n > 1$ be an integer and let $f : \mathbb{R}^n \to \mathbb{R}$ be a continuous function which maps
the 0-vector to 0 and which satisfies the condition that $f(v + w) + f(v - w) = 2f(v)$
for all $v, w \in \mathbb{R}^n$. Show that $f \in D(\mathbb{R}^n)$.


Exercise 920
Let $a \in \mathbb{R}$, let $n$ be a positive integer, and let $A, B \in M_{n \times n}(\mathbb{R})$. Does there
necessarily exist a matrix $C \in M_{n \times n}(\mathbb{R})$ satisfying $AC + \operatorname{tr}(C)A = B$?


Exercise 921 

Let F be a field and let n be a positive integer. Let 6: Mnxn(F) > F be 
the linear functional given by 6 : [ajj] > X; Di 14ij. Find an endomor- 
phism æ of Mnxn(F) satisfying the condition that 6(A) = a - tr(@(A)) for all 
A € Mnxn (F). 


Exercise 922 
Let $F$ be a field and let $n$ be a positive integer. Let $A, B \in M_{n \times n}(F)$ be matrices
satisfying $A^2 + B^2 = I$ and $AB + BA = O$. Show that $\operatorname{tr}(A) = \operatorname{tr}(B) = 0$.


Exercise 923
Let $F$ be a field and let $n$ be a positive integer. Given a positive integer $k$, is it
necessarily true that $\operatorname{tr}((AB)^k) = \operatorname{tr}(A^k)\operatorname{tr}(B^k)$ for all $A, B \in M_{n \times n}(F)$?

Exercise 924
Let $V$ be a vector space finitely generated over a field $F$ and let $\alpha \in \operatorname{End}(V)$.
Show that $\alpha$ and $D(\alpha)$ have identical minimal polynomials.


Exercise 925

Let $V$ be a vector space over a field $F$ and let $n$ be a positive integer. Let
$v_1, \dots, v_n$ be distinct vectors in $V$ and assume that there exist $\alpha \in \operatorname{End}(V)$ and
$\delta \in D(V)$ such that the matrix $[\delta\alpha^{i-1}(v_j)] \in M_{n \times n}(F)$ is nonsingular. Show
that the set $\{v_1, \dots, v_n\}$ is linearly independent.


Exercise 926 
Let W be a subspace of a vector space V over a field F. Show that W is a maxi- 
mal subspace of V if and only if every complement of W in V has dimension 1. 


Exercise 927 
Let $F$ be a field and let $n$ be a positive integer. For matrices $A, B \in M_{n \times n}(F)$,
calculate $\operatorname{tr}([AB - BA][AB + BA])$.


Exercise 928

Let $n$ be a positive integer. Can we find matrices $A, B \in M_{n \times n}(\mathbb{C})$ satisfying the
condition that all eigenvalues of $A$ and of $B$ are positive real numbers, but not all
eigenvalues of $A + B$ are positive real numbers?


Exercise 929
Let $k$ and $n$ be positive integers, let $F$ be a field, and let $O \neq A \in M_{k \times n}(F)$.
Does there necessarily exist a matrix $B \in M_{n \times k}(F)$ satisfying $\operatorname{tr}(AB) \neq 0$?


Exercise 930

Let $V$ be a vector space of finite dimension $n$ over a field $F$. A nonempty finite
collection $\{W_1, \dots, W_k\}$ of hyperplanes of $V$ is co-independent if and only if
$\dim\left(\bigcap_{i=1}^{k} W_i\right) = n - k$. Is a nonempty subcollection of a co-independent
collection of hyperplanes necessarily co-independent?


Exercise 931
If $V$ is a vector space over $\mathbb{R}$ then the complexification of $D(V)$ is isomorphic to
$\operatorname{Hom}_{\mathbb{R}}(V, \mathbb{C})$ as vector spaces over $\mathbb{C}$.


Exercise 932
Let $F$ be a field and let $k$ and $n$ be positive integers. If $A \in M_{k \times n}(F)$, are
$\operatorname{tr}(AA^T)$ and $\operatorname{tr}(A^T A)$ necessarily equal?


Exercise 933
Let $F$ be a field of characteristic other than 2. Show that any matrix $A \in M_{2 \times 2}(F)$
can be written in the form $cI + B$, where $c \in F$ and $\operatorname{tr}(B) = 0$.


15 Inner Product Spaces


In this chapter, we will have to restrict the set of fields over which we work. A
subfield $F$ of $\mathbb{R}$ is real Euclidean if and only if for each $0 < c \in F$ there exists an
element $d \in F$ satisfying $d^2 = c$, and a subfield $K$ of $\mathbb{C}$ is Euclidean if and only if
there exists a real Euclidean field $F$ such that $K = \{a + bi \mid a, b \in F\}$. It is
immediately clear that if $K$ is a Euclidean field and $c \in K$, then $\bar{c} \in K$. Being a Euclidean
field is intimately tied in with the constructibility of elements of the complex plane
by straightedge and compass constructions, and in fact every real Euclidean field
must contain all those real numbers which are the lengths of line segments obtainable
from the unit line segment by straightedge and compass construction methods.
Clearly, $\mathbb{R}$ itself is real Euclidean, while $\mathbb{Q}$, as we have already noted, is not; the set
of real numbers algebraic over $\mathbb{Q}$ is real Euclidean and properly contained in $\mathbb{R}$. The
field $\mathbb{C}$ is Euclidean, and the set of all algebraic numbers is Euclidean and properly
contained in $\mathbb{C}$.

Let $V$ be a vector space over a Euclidean field $F$. A function $\mu$ from $V \times V$ to
$F$ is an inner product on $V$ if and only if:

(1) For each $w \in V$, the function $v \mapsto \mu(v, w)$ from $V$ to $F$ is a linear functional;

(2) If $v, w \in V$ then $\mu(v, w) = \overline{\mu(w, v)}$;

(3) If $v \in V$ then $\mu(v, v)$ is a nonnegative real number, which equals 0 if and only
if $v = 0_V$.

Note that, in the above situation, if $v, w \in V$ then, as a consequence of (2),
$\mu(v, w) + \mu(w, v) = 2\operatorname{Re}(\mu(v, w))$ is also always a real number, though it may,
of course, be negative.

In general, once we have fixed an inner product on a space, we will write $\langle v, w \rangle$
instead of $\mu(v, w)$. A vector space over a Euclidean subfield $F$ of $\mathbb{C}$ on which
we have an inner product defined is called an inner product space. Another term for
such a space, coming from functional analysis, is a pre-Hilbert space. Abstract inner
product spaces were first studied in an axiomatic manner by von Neumann. While
inner product spaces over general Euclidean fields may prove to be interesting in
the future, at the moment the study of such spaces is almost universally restricted
to spaces over $\mathbb{R}$ or $\mathbb{C}$, and so from now on we will do the same and consider
only these as possible fields of scalars. When we talk about an inner product space
without specifying the field of scalars, we will always assume that it is one of these
two fields.


Example Let $n$ be a positive integer and let $F$ be either $\mathbb{R}$ or $\mathbb{C}$. We define an inner
product on $F^n$, called the dot product, as follows: if
$v = \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix}$ and $w = \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix}$,
then we set $v \cdot w = \sum_{i=1}^{n} a_i \bar{b}_i$. Note that if $F = \mathbb{R}$, then this product just coincides
with the interior product $v \odot w$ which we defined earlier. However, that is not true
for the case $F = \mathbb{C}$, so we must be very careful to distinguish between the two
products. This modification of the definition is necessary since, over $\mathbb{C}$, we have
$\begin{bmatrix} 1 \\ i \end{bmatrix} \odot \begin{bmatrix} 1 \\ i \end{bmatrix} = 0$, even though $\begin{bmatrix} 1 \\ i \end{bmatrix} \neq \begin{bmatrix} 0 \\ 0 \end{bmatrix}$. Hence the interior product $\odot$ is not an
inner product as we have defined it in this chapter.


With kind permission of the Archives of the Mathematisches
Forschungsinstitut Oberwolfach (Artin).

The problem arises because, in $\mathbb{C}$, 0 can be written as the sum of squares of
nonzero elements. A field $F$ in which 0 cannot be the sum of squares of nonzero
elements of $F$ is formally real, so $\mathbb{R}$ is formally real while $\mathbb{C}$ is not. The theory
of formally real fields was developed in the 1920s by the Austrian mathematicians
Emil Artin and Otto Schreier.


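We can generalize the previous example. If $F$ is either $\mathbb{R}$ or $\mathbb{C}$, and if $D = [d_{ij}]$
is a nonsingular matrix in $M_{n \times n}(F)$, we can define an inner product on $F^n$ by setting

$$\left\langle \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix}, \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix} \right\rangle
= \begin{bmatrix} a_1 & \cdots & a_n \end{bmatrix} D D^H \begin{bmatrix} \bar{b}_1 \\ \vdots \\ \bar{b}_n \end{bmatrix},$$

where $D^H = [\bar{d}_{ij}]^T$. The matrix $D^H$ is called the conjugate transpose or Hermitian
transpose of $D$, and it again belongs to $M_{n \times n}(F)$. Conjugate transposes of matrices
over $\mathbb{C}$ will play an important part in the following discussion; of course, $D^H = D^T$
for any matrix $D \in M_{n \times n}(\mathbb{R})$.

The properties of the conjugate transpose are very much like those of the transpose.
Indeed, we note that if $A, B \in M_{n \times n}(\mathbb{C})$ and $c \in \mathbb{C}$, then $(A + B)^H = A^H + B^H$,
$(cA)^H = \bar{c}A^H$, $A^{HH} = A$, and $(AB)^H = B^H A^H$. In particular, if $A$ is nonsingular
then $I = I^H = (AA^{-1})^H = (A^{-1})^H A^H$, proving that $(A^{-1})^H = (A^H)^{-1}$.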
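
The listed identities can be spot-checked numerically. The sketch below is only an
illustration, not a proof: the helper names and the sample matrices are hypothetical, and
the entries are chosen with small integer real and imaginary parts so that the
floating-point arithmetic of Python's `complex` type is exact.

```python
def conj_transpose(A):
    """Return A^H: entry (j, i) of the result is the conjugate of A[i][j]."""
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def mat_add(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    """Scalar multiple cA."""
    return [[c * a for a in row] for row in A]

A = [[1 + 2j, 3j], [2, 1 - 1j]]
B = [[2 - 1j, 1], [1j, 4]]
c = 2 + 3j

sum_rule = conj_transpose(mat_add(A, B)) == mat_add(conj_transpose(A), conj_transpose(B))
scalar_rule = conj_transpose(scale(c, A)) == scale(c.conjugate(), conj_transpose(A))
involution = conj_transpose(conj_transpose(A)) == A
product_rule = conj_transpose(mat_mul(A, B)) == mat_mul(conj_transpose(B), conj_transpose(A))
```

Each of the four flags compares the two sides of one identity for this particular pair
of matrices; note the reversed order of the factors in `product_rule`.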


Example If we are given positive real numbers $c_1, \dots, c_n$ and consider the
diagonal matrix $D = [d_{ij}] \in M_{n \times n}(\mathbb{R})$ the diagonal entries of which are given by
$d_{ii} = \sqrt{c_i}$ for $1 \le i \le n$, then, by the above, we have an inner product on $\mathbb{C}^n$ given by

$$\left\langle \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix}, \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix} \right\rangle = \sum_{i=1}^{n} c_i a_i \bar{b}_i.$$

Such a product is called a weighted dot product.
Weighted dot products are extremely important in statistics and data analysis, where
we often want to emphasize the values of certain parameters and de-emphasize others.


Example Let $a < b$ be real numbers and let $V = C(a, b)$. This is, as we have
seen, a vector space over $\mathbb{R}$, on which we can define an inner product
$\langle f, g \rangle = \int_a^b f(x)g(x)\,dx$. Continuity is important here. The set $Y$ of all functions from
$[a, b]$ to $\mathbb{R}$ which are continuous at all but finitely-many points is a subspace of
$\mathbb{R}^{[a,b]}$ properly containing $C(a, b)$ but $\langle f, g \rangle = \int_a^b f(x)g(x)\,dx$ is not an inner product
on $Y$. Indeed, if we select a real number $c$ satisfying $a < c < b$ and define the
function $f \in Y$ by

$$f : x \mapsto \begin{cases} 1 & \text{if } x = c, \\ 0 & \text{otherwise} \end{cases}$$

then $f$ is a nonzero element of $Y$ but $\langle f, f \rangle = 0$.
Similarly, if $V$ is the set of all continuous complex-valued functions defined on
the closed interval $[a, b]$ in $\mathbb{R}$, then $V$ is a vector space over $\mathbb{C}$, on which we can
define an inner product $\langle f, g \rangle = \int_a^b f(x)\overline{g(x)}\,dx$.


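Example Let $F$ be $\mathbb{R}$ or $\mathbb{C}$, and let $V = M_{n \times n}(F)$, which is a vector space
over $F$. Define an inner product on $V$ by setting $\langle A, B \rangle = \operatorname{tr}(AB^H) = \operatorname{tr}(B^H A)$. If
$A = [a_{ij}]$ and $B = [b_{ij}]$, then $\langle A, B \rangle = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij}\bar{b}_{ij}$. In particular,
$\langle A, A \rangle = \sum_{i=1}^{n} \sum_{j=1}^{n} |a_{ij}|^2$.


Example Let $V$ be the subspace of $\mathbb{C}^{\infty}$ composed of all those sequences $[c_0, c_1, \dots]$
of complex numbers satisfying $\sum_{i=0}^{\infty} |c_i|^2 < \infty$. This vector space is very important
in analysis, and we can define an inner product on it by setting
$\langle [c_0, c_1, \dots], [d_0, d_1, \dots] \rangle = \sum_{i=0}^{\infty} c_i \bar{d}_i$.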


Let F be R or C, and let W be a subspace of an inner product space V over F. 
The restriction of this inner product to a function from W x W to F is an inner 
product on W. Thus we can always assume that any subspace of an inner product 
space V inherits the inner-product-space structure of V. 


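Example Let $V$ be an inner product space over $\mathbb{R}$ and let $K$ be the set of all matrices
of the form $\begin{bmatrix} a & v \\ w & b \end{bmatrix}$, where $a, b \in \mathbb{R}$ and $v, w \in V$. Then $K$ is a vector space over
$\mathbb{R}$, where addition and scalar multiplication are defined by

$$\begin{bmatrix} a & v \\ w & b \end{bmatrix} + \begin{bmatrix} a' & v' \\ w' & b' \end{bmatrix} = \begin{bmatrix} a + a' & v + v' \\ w + w' & b + b' \end{bmatrix}
\quad \text{and} \quad
c \begin{bmatrix} a & v \\ w & b \end{bmatrix} = \begin{bmatrix} ca & cv \\ cw & cb \end{bmatrix}.$$

We create the structure of an $\mathbb{R}$-algebra on $K$ by defining an operation $\bullet$ as follows:

$$\begin{bmatrix} a & v \\ w & b \end{bmatrix} \bullet \begin{bmatrix} a' & v' \\ w' & b' \end{bmatrix}
= \begin{bmatrix} aa' + \langle v, w' \rangle & av' + b'v \\ a'w + bw' & bb' + \langle w, v' \rangle \end{bmatrix}.$$

This algebra is a division algebra, called the Cayley algebra, and it is not associative.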


We now look at some properties of general inner product spaces. 


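Proposition 15.1 Let $V$ be an inner product space. For $v, w_1, w_2 \in V$ and
for a scalar $a$, we have:

(1) $\langle v, w_1 + w_2 \rangle = \langle v, w_1 \rangle + \langle v, w_2 \rangle$;

(2) $\langle v, aw_1 \rangle = \bar{a}\langle v, w_1 \rangle$;

(3) $\langle 0_V, w_1 \rangle = \langle v, 0_V \rangle = 0$.


Proof From the definition of the inner product, we have
$\langle v, w_1 + w_2 \rangle = \overline{\langle w_1 + w_2, v \rangle} = \overline{\langle w_1, v \rangle + \langle w_2, v \rangle} = \overline{\langle w_1, v \rangle} + \overline{\langle w_2, v \rangle} = \langle v, w_1 \rangle + \langle v, w_2 \rangle$,
which proves (1). We also have
$\langle v, aw_1 \rangle = \overline{\langle aw_1, v \rangle} = \overline{a\langle w_1, v \rangle} = \bar{a}\,\overline{\langle w_1, v \rangle} = \bar{a}\langle v, w_1 \rangle$,
which proves (2). Finally, $\langle 0_V, w_1 \rangle = \langle 0 \cdot 0_V, w_1 \rangle = 0\langle 0_V, w_1 \rangle = 0$, and similarly
$\langle v, 0_V \rangle = 0$, proving (3).


By Proposition 15.1 we see that if $V$ is an inner product space over $\mathbb{R}$ then for
each $v \in V$ the function $w \mapsto \langle v, w \rangle$ from $V$ to $\mathbb{R}$ is a linear transformation, but that
is not the case for inner product spaces over $\mathbb{C}$.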

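Let $V$ be a finitely-generated inner product space and let $v_1, \dots, v_k$ be a list of
vectors in $V$. The Gram matrix of this list is the $k \times k$ matrix $G = [g_{ij}]$ defined
by $g_{ij} = \langle v_i, v_j \rangle$ for all $1 \le i, j \le k$. Let $B = \{v_1, \dots, v_n\}$ be a basis for $V$. Given
vectors $v = \sum_{i=1}^{n} a_i v_i$ and $w = \sum_{j=1}^{n} b_j v_j$ in $V$, we note that

$$\langle v, w \rangle = \sum_{i=1}^{n} \sum_{j=1}^{n} a_i \bar{b}_j \langle v_i, v_j \rangle
= \begin{bmatrix} a_1 & \cdots & a_n \end{bmatrix} G \begin{bmatrix} \bar{b}_1 \\ \vdots \\ \bar{b}_n \end{bmatrix},$$

where $G$ is the Gram matrix of $B$.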
Jørgen Gram was a Danish mathematician who at the end of the nineteenth century
developed computational techniques for inner product spaces in connection with his
work for insurance companies.

Example Let $V$ be the subspace of $\mathbb{C}[X]$ consisting of all polynomials of degree
at most 5, and let $B$ be the canonical basis for $V$. Define an inner product on $V$
by setting $\langle f, g \rangle = \int_0^1 f(x)\overline{g(x)}\,dx$. (Note that we are using the same notation for
a polynomial and its corresponding polynomial function in $\mathbb{C}^{\mathbb{R}}$.) Then the Gram
matrix defined by $B$ is precisely the Hilbert matrix $H_6$, which we have seen earlier.
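
The last claim amounts to the computation $\int_0^1 x^{i-1}x^{j-1}\,dx = \frac{1}{i+j-1}$, and it can be
confirmed mechanically. In the sketch below a polynomial is represented by its list of
coefficients in increasing degree; the function names are hypothetical, and the
coefficients are taken to be integers, so that complex conjugation plays no role.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists [c_0, c_1, ...]."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def integrate_01(p):
    """Integral over [0, 1]: the term c_k x^k contributes c_k / (k + 1)."""
    return sum(Fraction(c, k + 1) for k, c in enumerate(p))

def gram_matrix(basis):
    """Gram matrix [<p_i, p_j>] for the inner product <f, g> = integral of f g."""
    return [[integrate_01(poly_mul(p, q)) for q in basis] for p in basis]

# Canonical basis 1, X, ..., X^5 of the polynomials of degree at most 5.
monomials = [[0] * k + [1] for k in range(6)]
G = gram_matrix(monomials)

# Hilbert matrix H_6 with 0-based indices: entry (i, j) is 1 / (i + j + 1).
hilbert = [[Fraction(1, i + j + 1) for j in range(6)] for i in range(6)]
```

The matrices `G` and `hilbert` coincide entry by entry.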


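Proposition 15.2 (Cauchy–Schwarz–Bunyakovsky Theorem) Let $V$ be an
inner product space. If $v, w \in V$, then $|\langle v, w \rangle|^2 \le \langle v, v \rangle \langle w, w \rangle$.


Proof If $v = 0_V$ or $w = 0_V$ then the result is immediate, and so we can assume that
both vectors differ from $0_V$. Let $a = -\langle w, v \rangle$ and $b = \langle v, v \rangle$. Then $\bar{a} = -\langle v, w \rangle$ and
$\bar{b} = b$ so

$$0 \le \langle av + bw, av + bw \rangle = a\bar{a}\langle v, v \rangle + a\bar{b}\langle v, w \rangle + b\bar{a}\langle w, v \rangle + b^2\langle w, w \rangle$$
$$= a\bar{a}b - a\bar{a}b - a\bar{a}b + b^2\langle w, w \rangle = b[-a\bar{a} + b\langle w, w \rangle].$$

Since $v \neq 0_V$, it follows that $b$ is a positive real number and so $a\bar{a} \le b\langle w, w \rangle$, which
is what we want.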


With kind permission of ETH-Bibliothek Zurich, Image 
Archive (Schwarz). 

Hermann Schwarz was a German mathematician 
who in the late nineteenth century studied spaces 
of functions and their structure as inner product 
spaces. Viktor Yakovlevich Bunyakovsky was a 
Russian student of Cauchy who proved this theorem 
a generation before Schwarz, but since his 
work was published in an obscure journal, it was not widely recognized until the twentieth 
century. 


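Example If $a_1, \ldots, a_n, b_1, \ldots, b_n, c_1, \ldots, c_n$ are real numbers, with $c_i > 0$ for all 
$1 \le i \le n$, then 

\[ \left(\sum_{i=1}^n c_i a_i b_i\right)^2 \le \left(\sum_{i=1}^n c_i a_i^2\right)\left(\sum_{i=1}^n c_i b_i^2\right). \]

Indeed, this is a consequence of the Cauchy-Schwarz-Bunyakovsky Theorem, using 
the weighted dot product 

\[ \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix} \cdot \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix} = \sum_{i=1}^n c_i a_i b_i \]

defined on $\mathbb{R}^n$. 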
In general, the Cauchy-Schwarz-Bunyakovsky Theorem is an extremely rich 
source of inequalities between real-valued functions of several real variables. For 
example, consider the vectors 

\[ v = \begin{bmatrix} \sqrt{\dfrac{a+b}{a+b+c}} \\[2mm] \sqrt{\dfrac{a+c}{a+b+c}} \\[2mm] \sqrt{\dfrac{b+c}{a+b+c}} \end{bmatrix} \quad\text{and}\quad w = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \]

in $\mathbb{R}^3$, where $a$, $b$, and $c$ are positive. Since $v \cdot v = 2$ and $w \cdot w = 3$, the 
Cauchy-Schwarz-Bunyakovsky Theorem shows that 

\[ \sqrt{\frac{a+b}{a+b+c}} + \sqrt{\frac{a+c}{a+b+c}} + \sqrt{\frac{b+c}{a+b+c}} \le \sqrt{6}. \]

Similarly, we note that the matrix D = ike vA € M?2x2(R) is nonsingular and 


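so, by a previous example, we have an inner product $\mu$ on $\mathbb{R}^2$ defined by 

\[ \mu\left(\begin{bmatrix} a \\ b \end{bmatrix}, \begin{bmatrix} c \\ d \end{bmatrix}\right) = \begin{bmatrix} a & b \end{bmatrix} D D^T \begin{bmatrix} c \\ d \end{bmatrix} = 3(ac + bd) + \sqrt{3}\,(ad + bc). \]

Applying the Cauchy-Schwarz-Bunyakovsky Theorem, we see that for all real 
numbers $a$, $b$, $c$, and $d$ we have 

\[ \bigl[3(ac + bd) + \sqrt{3}\,(ad + bc)\bigr]^2 \le \bigl[3(a^2 + b^2) + 2\sqrt{3}\,ab\bigr]\bigl[3(c^2 + d^2) + 2\sqrt{3}\,cd\bigr]. \]

In particular, if we take $b = d = \sqrt{3}$, we see that $(ac + a + c + 3)^2 \le (a^2 + 2a + 3)(c^2 + 2c + 3)$ 
for all real numbers $a$ and $c$. 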
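As a quick sanity check of the final inequality, $(ac + a + c + 3)^2 \le (a^2 + 2a + 3)(c^2 + 2c + 3)$, the sketch below evaluates both sides on a grid of integers, so the arithmetic is exact. The function names and the grid bounds are illustrative choices, not part of the text.

```python
def lhs(a, c):
    """Left side: (ac + a + c + 3)^2."""
    return (a * c + a + c + 3) ** 2

def rhs(a, c):
    """Right side: (a^2 + 2a + 3)(c^2 + 2c + 3)."""
    return (a * a + 2 * a + 3) * (c * c + 2 * c + 3)

def holds_on_grid(lo=-30, hi=30):
    """True if the inequality holds for every integer pair in the grid."""
    return all(lhs(a, c) <= rhs(a, c)
               for a in range(lo, hi + 1)
               for c in range(lo, hi + 1))

# Equality occurs when a == c, which matches the equality case of the
# Cauchy-Schwarz-Bunyakovsky Theorem for linearly dependent vectors.
grid_ok = holds_on_grid()
equal_on_diagonal = all(lhs(a, a) == rhs(a, a) for a in range(-30, 31))
```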

Let $V$ be an inner product space. The norm of a vector $v \in V$ is defined to be the 
scalar $\|v\| = \sqrt{\langle v, v\rangle}$. A vector $v$ satisfying $\|v\| = 1$ is normal. 


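Example Let $V = \mathbb{R}^n$, and endow $V$ with the dot product. Then 

\[ \left\| \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix} \right\| = \sqrt{\sum_{i=1}^n a_i^2}. \]

This norm is known as the Euclidean norm on $V$. 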


Example Let $V = C(-\pi, \pi)$, on which we have defined the inner product 
$\langle f, g\rangle = \int_{-\pi}^{\pi} f(x)g(x)\,dx$. For each positive integer $k$, consider the function 
$f_k : x \mapsto \sin(kx)$. Then $\|f_k\| = \sqrt{\langle f_k, f_k\rangle} = \sqrt{\int_{-\pi}^{\pi} \sin^2(kx)\,dx} = \sqrt{\pi}$ and so 
$g_k = \frac{1}{\sqrt{\pi}} f_k$ is a normal vector in this space. 


We have seen how the vector space $\mathbb{R}^3$, endowed with the cross product $\times$, is a 
Lie algebra. It is easy to check that the cross product is related to the dot product on 
$\mathbb{R}^3$ by the relations 

\[ u \times (v \times w) = (u \cdot w)v - (u \cdot v)w \quad\text{and}\quad (u \times v) \times w = (u \cdot w)v - (v \cdot w)u \]

for all $u, v, w \in \mathbb{R}^3$. Moreover, we have the following identities: 
(1) $v \cdot (v \times w) = 0$ for all $v, w \in \mathbb{R}^3$; 
(2) (Lagrange identity) $\|v \times w\|^2 = \|v\|^2\|w\|^2 - (v \cdot w)^2$. 
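There are only two possible anticommutative operations on $\mathbb{R}^3$ which turn it into 
an $\mathbb{R}$-algebra satisfying these two identities, namely $\times$ and the operation $\times'$ given 
by $v \times' w = -(v \times w)$. Furthermore, if $n > 3$ no such operation can be defined 
on $\mathbb{R}^n$, except for the case of $n = 7$. In that case, we can define an operation $\times$ as 
follows: write elements of $\mathbb{R}^7$ in the form $\begin{bmatrix} v \\ a \\ v' \end{bmatrix}$, where $v, v' \in \mathbb{R}^3$ and $a \in \mathbb{R}$, and 
then set 

\[ \begin{bmatrix} v \\ a \\ v' \end{bmatrix} \times \begin{bmatrix} w \\ b \\ w' \end{bmatrix} = \begin{bmatrix} aw' - bv' + (v \times w) - (v' \times w') \\ -v \cdot w' + v' \cdot w \\ bv - aw - (v \times w') - (v' \times w) \end{bmatrix}. \]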
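The two identities can be tested on the seven-dimensional product with exact integer arithmetic. The sketch below assumes the block formula with both mixed terms $(v \times w')$ and $(v' \times w)$ in the last block carrying a minus sign, which is the choice forced by anticommutativity and by identity (1); the function names and the sample vectors are illustrative only.

```python
def dot(u, v):
    """Dot product of two equal-length tuples."""
    return sum(x * y for x, y in zip(u, v))

def cross3(u, v):
    """Ordinary cross product on R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def cross7(x, y):
    """Cross product on R^7, with x = (v, a, v') and y = (w, b, w')."""
    v, a, vp = x[0:3], x[3], x[4:7]
    w, b, wp = y[0:3], y[3], y[4:7]
    vw, vpwp = cross3(v, w), cross3(vp, wp)
    vwp, vpw = cross3(v, wp), cross3(vp, w)
    top = tuple(a * wp[i] - b * vp[i] + vw[i] - vpwp[i] for i in range(3))
    mid = -dot(v, wp) + dot(vp, w)
    bot = tuple(b * v[i] - a * w[i] - vwp[i] - vpw[i] for i in range(3))
    return top + (mid,) + bot

# Hypothetical sample vectors in R^7.
x = (1, 2, 3, 4, 5, 6, 7)
y = (2, -1, 0, 3, 1, -2, 5)
z = cross7(x, y)

orthogonal = dot(x, z) == 0 and dot(y, z) == 0               # identity (1)
lagrange = dot(z, z) == dot(x, x) * dot(y, y) - dot(x, y) ** 2  # identity (2)
anticommutative = cross7(y, x) == tuple(-t for t in z)
```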
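We also note that if $u = \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}$, $v = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}$, and $w = \begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix}$ in $\mathbb{R}^3$ then 

\[ u \cdot (v \times w) = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}. \]

As an immediate consequence, we observe that if $u, v, w \in \mathbb{R}^3$ then: 

(1) $u \cdot (v \times w) = v \cdot (w \times u) = w \cdot (u \times v)$; 

(2) $u \cdot (v \times w) = 0$ if and only if two of these vectors are equal or the set $\{u, v, w\}$ 
is linearly dependent. 

The scalar value $u \cdot (v \times w)$ is often called the scalar triple product of the vectors 
$u, v, w$, to distinguish it from the vector triple product $u \times (v \times w)$. 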


Proposition 15.3 Let $V$ be an inner product space. If $v, w \in V$ and if $a$ is a 
scalar, then: 

(1) $\|av\| = |a| \cdot \|v\|$; 

(2) $\|v\| \ge 0$, with equality if and only if $v = 0_V$; 

(3) (Minkowski's inequality): $\|v + w\| \le \|v\| + \|w\|$; 

(4) (Parallelogram law): $\|v + w\|^2 + \|v - w\|^2 = 2(\|v\|^2 + \|w\|^2)$; 

(5) (Triangle difference inequality): $\|v - w\| \ge \bigl|\,\|v\| - \|w\|\,\bigr|$. 
With kind permission of ETH-Bibliothek Zurich, Image Archive. 


Hermann Minkowski, a German mathematician at the end of the nineteenth 
century, built an elegant mathematical framework for the theory 
of relativity, using four-dimensional non-Euclidean geometry. 


Proof We see that $\|av\| = \sqrt{\langle av, av\rangle} = \sqrt{a\overline{a}\langle v, v\rangle} = |a| \cdot \|v\|$, proving (1). Inequality 
(2) follows immediately from the definition. As for (3), note that if $z = a + bi$ 
then $z + \overline{z} = 2a \le 2|a| = 2\sqrt{a^2} \le 2\sqrt{a^2 + b^2} = 2|z|$. As a consequence of the 
Cauchy-Schwarz-Bunyakovsky Theorem, we see that $|\langle v, w\rangle| = |\langle w, v\rangle| \le \|v\| \cdot \|w\|$, 
and so, taking $z = \langle v, w\rangle$, 

\[ \|v + w\|^2 = \langle v + w, v + w\rangle = \langle v, v\rangle + \langle v, w\rangle + \langle w, v\rangle + \langle w, w\rangle \]
\[ \le \|v\|^2 + 2\|v\| \cdot \|w\| + \|w\|^2 = (\|v\| + \|w\|)^2, \]

and that proves (3). Moreover, we know that 
$\|v + w\|^2 = \langle v + w, v + w\rangle = \langle v, v\rangle + \langle v, w\rangle + \langle w, v\rangle + \langle w, w\rangle$ 
and $\|v - w\|^2 = \langle v - w, v - w\rangle = \langle v, v\rangle - \langle v, w\rangle - \langle w, v\rangle + \langle w, w\rangle$. Adding these 
two gives us (4). 

Finally, by (3), we have $\|v\| = \|w + (v - w)\| \le \|w\| + \|v - w\|$, and so 
$\|v - w\| \ge \|v\| - \|w\|$. Interchanging the roles of $v$ and $w$ and using (1) gives 
us $\|v - w\| = \|w - v\| \ge \|w\| - \|v\|$, and so we have (5). 


Note that by Proposition 15.3 we see that if $0_V \ne v \in V$ then $\frac{1}{\|v\|}v$ is a normal 
vector. Moreover, if $v$ is normal and $c$ is a scalar satisfying $|c| = 1$, then $cv$ is again 
normal. 


Example Let $V$ be an inner product space, and let $\Omega$ be a nonempty set. A function 
$f \in V^\Omega$ is bounded if and only if there exists a real number $b_f$ satisfying $\|f(i)\| \le b_f$ 
for all $i \in \Omega$. If $f, g \in V^\Omega$ are bounded functions then, from Minkowski's 
inequality, we conclude that $\|(f + g)(i)\| \le \|f(i)\| + \|g(i)\| \le b_f + b_g$ for all $i \in \Omega$. 
If $c$ is a scalar then $\|(cf)(i)\| = |c| \cdot \|f(i)\| \le |c|\,b_f$ for all $i \in \Omega$. Thus $f + g$ 
and $cf$ are both bounded, and we see that the set of all bounded elements of $V^\Omega$ is 
a subspace of $V^\Omega$. 


Example We now return to a previous example. Let $p$ be an integer greater 
than 1, not necessarily prime, and let $G = \mathbb{Z}/(p)$, on which we have an operation 
of addition as defined in Chap. 2. Let $V = \mathbb{C}^G$, which is a vector space 
of dimension $p$ over $\mathbb{C}$. On this space, we can define an inner product by setting 
$\langle f, g\rangle = \sum_{n \in G} f(n)\overline{g(n)}$. Every element $n \in G$ defines a function 
$h_n : k \mapsto \cos\left(\frac{2\pi nk}{p}\right) + i \sin\left(\frac{2\pi nk}{p}\right)$ which belongs to $V$. Given a function $f \in V$, define a function 
$\hat{f} \in V$ as $\hat{f} : n \mapsto \langle f, h_n\rangle = \sum_{k \in G} f(k)h_n(-k)$. This function is called the 
discrete Fourier transform of $f$ of order $p$. One can show that the function $f \mapsto \hat{f}$ 
is in fact an automorphism of $V$. Moreover, $\hat{\hat{f}}(n) = p f(-n)$ and $\|\hat{f}\| = \sqrt{p}\,\|f\|$ 
for all $f \in V$ and all $n \in G$. 

Example There are various generalizations of Proposition 15.2 which, as a rule, require 
more sophisticated methods of complex analysis to prove. For example, the 
contemporary Greek mathematicians Manolis Magiropoulos and Dimitri Karayannakis 
have shown that if $V$ is an inner product space and if $u$, $v$, and $w$ are distinct 
elements of $V$, then 

\[ 2|\langle u, v\rangle| \cdot |\langle u, w\rangle| \le \langle u, u\rangle\,\bigl[\|v\| \cdot \|w\| + |\langle v, w\rangle|\bigr]. \]

In case the set $\{v, w\}$ is linearly dependent, it is clear that this reduces to the inequality 
in Proposition 15.2. Inequalities such as these allow us to get better bounds on 
inner products. For example, let $0 < a < b$ be real numbers and let $V = C(a, b)$, on 
which we have the inner product $\langle f, g\rangle = \int_a^b f(x)g(x)\,dx$. If $u, v, w \in V$ are given 
by $u : x \mapsto 1/x$, $v : x \mapsto \sin(x)$, and $w : x \mapsto \cos(x)$ then Proposition 15.2 gives us 
the bound 

\[ \left|\int_a^b \frac{\sin(x)}{x}\,dx\right| \cdot \left|\int_a^b \frac{\cos(x)}{x}\,dx\right| \le \left(\int_a^b \frac{dx}{x^2}\right) \sqrt{\int_a^b \sin^2(x)\,dx}\,\sqrt{\int_a^b \cos^2(x)\,dx} \]

whereas this result gives us the better upper bound 

\[ \frac{1}{2}\left(\int_a^b \frac{dx}{x^2}\right) \left[\sqrt{\int_a^b \sin^2(x)\,dx}\,\sqrt{\int_a^b \cos^2(x)\,dx} + \left|\int_a^b \sin(x)\cos(x)\,dx\right|\right]. \]


Proposition 15.4 Let $V$ be an inner product space and let $\alpha \in \mathrm{End}(V)$ satisfy 
the condition that there exists a real number $0 < c < 1$ such that $\|\alpha(v)\| \le c\|v\|$ 
for all $v \in V$. Then $\sigma_1 + \alpha$ is monic. 


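Proof If $0_V \ne v \in V$ then, by Proposition 15.3, 

\[ \|v\| = \|v + \alpha(v) - \alpha(v)\| \le \|v + \alpha(v)\| + \|\alpha(v)\| = \|(\sigma_1 + \alpha)(v)\| + \|\alpha(v)\| \]

and so $\|(\sigma_1 + \alpha)(v)\| \ge \|v\| - \|\alpha(v)\| \ge (1 - c)\|v\| > 0$, which shows that 
$v \notin \ker(\sigma_1 + \alpha)$. Thus $\sigma_1 + \alpha$ is monic. 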


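In particular, if $V$ is a finitely-generated inner product space and if $\alpha \in \mathrm{End}(V)$ 
satisfies the condition that there exists a real number $0 < c < 1$ such that $\|\alpha(v)\| \le c\|v\|$ 
for all $v \in V$, then $\sigma_1 + \alpha \in \mathrm{Aut}(V)$. Let $\beta = (\sigma_1 + \alpha)^{-1}$. If $0_V \ne v \in V$ then 

\[ \|v\| = \|(\sigma_1 + \alpha)\beta(v)\| = \|\beta(v) + \alpha\beta(v)\| \ge \|\beta(v)\| - \|\alpha\beta(v)\| \ge \|\beta(v)\| - c\|\beta(v)\| = (1 - c)\|\beta(v)\|. \]

Similarly, $\|v\| \le \|\beta(v)\| + \|\alpha\beta(v)\| \le \|\beta(v)\| + c\|\beta(v)\| = (1 + c)\|\beta(v)\|$ and so 
$\frac{1}{1+c}\|v\| \le \|\beta(v)\| \le \frac{1}{1-c}\|v\|$ for all $v \in V$. 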

Sometimes, however, we need a bit more generality. If $V$ is a vector space over 
$\mathbb{R}$ or $\mathbb{C}$ then, in general, a function $v \mapsto \|v\|$ satisfying conditions (1)-(3) of Proposition 
15.3 is called a norm and a vector space on which a fixed norm is defined is 
called a normed space or, in a functional-analysis context, a pre-Banach space. An 
immediate question is whether every norm defined on a vector space comes from an 
inner product. The answer is negative: if, for example, we define the norm $\|\cdot\|_1$ on 
$\mathbb{C}^n$ by setting 

\[ \left\| \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix} \right\|_1 = \sum_{i=1}^n |a_i|, \]

then this cannot come from an inner product 
since the parallelogram law is not satisfied by this norm. In fact, satisfying the parallelogram 
law is not only necessary but also sufficient for a norm to come from an inner product, in the following 
sense: let $V$ be a vector space over $\mathbb{R}$ or $\mathbb{C}$ on which we have a norm $\psi : V \to \mathbb{R}$ 
satisfying $\psi(v + w)^2 + \psi(v - w)^2 = 2[\psi(v)^2 + \psi(w)^2]$ for all $v, w \in V$, and write 
$\lambda(v, w) = \frac{1}{4}[\psi(v + w)^2 - \psi(v - w)^2]$. Then it is possible to define an inner product 
on $V$ relative to which the norm of a vector $v$ is precisely $\psi(v)$. In the case 
the field of scalars is $\mathbb{R}$, this inner product is defined by $\langle v, w\rangle = \lambda(v, w)$, and 
otherwise this inner product is defined by $\langle v, w\rangle = \lambda(v, w) + i\lambda(v, iw)$. 
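Both halves of this discussion can be illustrated numerically over $\mathbb{R}$. The sketch below, with hypothetical helper names and sample vectors, measures by how much a norm violates the parallelogram law and recovers the dot product from the Euclidean norm through $\lambda(v, w) = \frac{1}{4}[\psi(v+w)^2 - \psi(v-w)^2]$.

```python
import math

def norm1(v):
    """The norm ||v||_1 = sum |a_i|."""
    return sum(abs(x) for x in v)

def norm2(v):
    """The Euclidean norm."""
    return math.sqrt(sum(abs(x) ** 2 for x in v))

def add(v, w):
    return tuple(x + y for x, y in zip(v, w))

def sub(v, w):
    return tuple(x - y for x, y in zip(v, w))

def parallelogram_defect(psi, v, w):
    """psi(v+w)^2 + psi(v-w)^2 - 2[psi(v)^2 + psi(w)^2]; zero iff the law holds for v, w."""
    return (psi(add(v, w)) ** 2 + psi(sub(v, w)) ** 2
            - 2 * (psi(v) ** 2 + psi(w) ** 2))

def polarize_real(psi, v, w):
    """lambda(v, w) = (1/4)[psi(v+w)^2 - psi(v-w)^2], real case."""
    return (psi(add(v, w)) ** 2 - psi(sub(v, w)) ** 2) / 4

e1, e2 = (1, 0), (0, 1)
defect_norm1 = parallelogram_defect(norm1, e1, e2)   # nonzero: law fails
defect_norm2 = parallelogram_defect(norm2, e1, e2)   # zero up to rounding

v, w = (1, 2, 3), (4, -5, 6)
recovered = polarize_real(norm2, v, w)
dot_product = sum(x * y for x, y in zip(v, w))
```

Here `defect_norm1` is nonzero, confirming that $\|\cdot\|_1$ cannot come from an inner product, while `recovered` agrees with the ordinary dot product of `v` and `w`.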


With kind permission of the Archives of the Mathematisches Forschungsinstitut 
Oberwolfach (Wiener); © Stefan Banach (Banach). 

Normed spaces were first studied at the beginning of the twentieth century by the 
Austrian mathematician Hans Hahn, and then by the American mathematician 
Norbert Wiener and the Polish mathematician Stefan Banach. 


Example Every vector space over $\mathbb{R}$ can be turned into a normed space in at least 
one way. Indeed, let $V$ be a vector space over $\mathbb{R}$ for which we fix a basis $\{v_i \mid i \in \Omega\}$. 
Then the function $\psi : V \to \mathbb{R}$ defined by $\psi : \sum_{i \in \Omega} a_i v_i \mapsto \sum_{i \in \Omega} |a_i|$ can easily be 
seen to be a norm on $V$. 


Example Let $V = C(0, 1)$, which is a vector space over $\mathbb{R}$, and for each positive 
integer $n$, let $f_n \in V$ be the function defined by 

\[ f_n : x \mapsto \begin{cases} 1 - nx & \text{if } 0 \le x \le \frac{1}{n}, \\ 0 & \text{otherwise.} \end{cases} \]

Let $\|\cdot\|$ be the norm defined on $V$ by the inner product $\langle f, g\rangle = \int_0^1 f(x)g(x)\,dx$ 
and let $\|\cdot\|_\infty$ be the norm on $V$ defined by $\|f\|_\infty = \sup\{|f(x)| \mid 0 \le x \le 1\}$. Then 
$\|f_n\| = \frac{1}{\sqrt{3n}}$ for all positive integers $n$, whereas $\|f_n\|_\infty = 1$ for all positive integers 
$n$. Thus there can be no real number $c$ satisfying $\|f\|_\infty \le c\|f\|$ for all $f \in V$. 


Example Let $V$ and $W$ be normed spaces over the same field of scalars $F$ (which is 
either $\mathbb{R}$ or $\mathbb{C}$). If $\alpha \in \mathrm{Hom}(V, W)$, set 

\[ \|\alpha\| = \sup\left\{ \frac{\|\alpha(v)\|}{\|v\|} \;\middle|\; 0_V \ne v \in V \right\}, \]

where the norm in the numerator is the one defined on $W$ and the norm in the denominator 
is the one defined on $V$. (If $V$ is trivial then the only such $\alpha$ is the 0-function, 
the norm of which we set equal to 0.) Note that the fraction $\|\alpha(v)\|/\|v\|$ is just 
$\|\alpha(v')\|$, where $v'$ is the normal vector $\frac{1}{\|v\|}v$, so we see that $\|\alpha\|$ is just $\sup\{\|\alpha(v')\|\}$, 
where the supremum runs over all normal vectors $v'$ in $V$. In particular, if $\delta \in D(V)$ 
then we define the norm of $\delta$ to be 

\[ \|\delta\| = \sup\left\{ \frac{|\delta(v)|}{\|v\|} \;\middle|\; 0_V \ne v \in V \right\}. \]


Note that $\|\alpha\|$ may not be finite, though it surely will be if $\alpha$ is bounded. For 
example, let $V$ be the space of all polynomial functions in $\mathbb{R}^{\mathbb{R}}$ on which we define 
the norm $\|f\| = \max\{|f(t)| \mid 0 \le t \le 1\}$. Let $\alpha$ be the differentiation endomorphism 
of $V$ and, for each $h \ge 1$, let $f_h \in V$ be given by $f_h : x \mapsto x^h$. Then 

\[ \frac{\|\alpha(f_h)\|}{\|f_h\|} = h \]

for each $h \ge 1$, showing that $\|\alpha\|$ is infinite. If $V$ is finitely generated, then we assert 
that $\|\alpha\|$ is finite for all $\alpha \in \mathrm{Hom}(V, W)$, a claim which we will justify in the next 
chapter. 


We claim that, if $\|\alpha\|$ is finite for all $\alpha$, then this is a norm defined on $\mathrm{Hom}(V, W)$, 
called the norm induced by the respective norms on $V$ and $W$. Indeed, as an immediate 
consequence of the definition we see that $\|\alpha\| \ge 0$ for all $\alpha \in \mathrm{Hom}(V, W)$, with 
equality happening only when $\alpha$ is the 0-function. We also see that if $\alpha$ is not the 
0-function then $\|\alpha\|$ is the smallest positive real number $c$ such that $\|\alpha(v)\| \le c\|v\|$ 
for all $v \in V$. (We note a subtle point here: the norms on $V$ and $W$ are, of course, 
different. Therefore, in the case $V = W$, and if we have two different norms defined 
on $V$, we may use one in the numerator and another in the denominator, though 
usually one uses the same norm in both instances.) 
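Now let $\alpha \in \mathrm{Hom}(V, W)$ and $a \in F$. Then 

\[ \|a\alpha\| = \sup\left\{ \frac{\|a\alpha(v)\|}{\|v\|} \;\middle|\; 0_V \ne v \in V \right\} = \sup\left\{ \frac{|a| \cdot \|\alpha(v)\|}{\|v\|} \;\middle|\; 0_V \ne v \in V \right\} = |a| \cdot \|\alpha\|. \]

Finally, if $\alpha, \beta \in \mathrm{Hom}(V, W)$ then 

\[ \|\alpha + \beta\| = \sup\left\{ \frac{\|(\alpha + \beta)(v)\|}{\|v\|} \;\middle|\; 0_V \ne v \in V \right\} = \sup\left\{ \frac{\|\alpha(v) + \beta(v)\|}{\|v\|} \;\middle|\; 0_V \ne v \in V \right\} \]
\[ \le \sup\left\{ \frac{\|\alpha(v)\| + \|\beta(v)\|}{\|v\|} \;\middle|\; 0_V \ne v \in V \right\} \le \|\alpha\| + \|\beta\|. \]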


If $V = F^n$ and $W = F^k$, endowed with respective dot products and the norms defined 
by them, then the induced norm on $\mathrm{Hom}(V, W)$ does, in fact, always exist and 
is called the spectral norm. If $A \in M_{k \times n}(F)$, then the spectral norm of $A$ is defined 
to be the spectral norm of the homomorphism from $F^n$ to $F^k$ given by $v \mapsto Av$. 

In 1941, Gelfand showed that if $n$ is a positive integer and $A \in M_{n \times n}(\mathbb{C})$, then 
the spectral radius of $A$ satisfies $\rho(A) = \lim_{k \to \infty} \sqrt[k]{\|A^k\|}$, where $\|\cdot\|$ is any norm 
defined on $M_{n \times n}(\mathbb{C})$. In other words, we see that, given $A \in M_{n \times n}(\mathbb{C})$, there exists 
a sufficiently large $k$ such that $\sqrt[k]{\|A^k\|}$ is approximately equal to $\rho(A)$. 
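Gelfand's formula can be watched converging on a small case. The sketch below is only an illustration under stated assumptions: it uses a hypothetical upper-triangular matrix whose spectral radius is visibly 2 (the larger diagonal entry), and it uses the Frobenius norm as the matrix norm, which the formula permits since any norm may be chosen.

```python
import math

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def frobenius(A):
    """Frobenius norm: square root of the sum of squared absolute entries."""
    return math.sqrt(sum(abs(x) ** 2 for row in A for x in row))

def gelfand_estimate(A, k):
    """The k-th root of ||A^k||, which tends to the spectral radius as k grows."""
    n = len(A)
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        power = matmul(power, A)
    return frobenius(power) ** (1.0 / k)

# Hypothetical matrix with eigenvalues 2 and 1, so its spectral radius is 2.
A = [[2, 1],
     [0, 1]]
estimates = {k: gelfand_estimate(A, k) for k in (1, 10, 50, 200)}
```

The entries of `estimates` approach 2 slowly from above, which is typical: the limit is exact, but for any fixed $k$ the $k$-th root still carries a factor that depends on the chosen norm.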


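Example If $p$ is any positive integer, we can define the Hölder norm $\|\cdot\|_p$ on $\mathbb{C}^n$ by 
setting 

\[ \left\| \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix} \right\|_p = \left(\sum_{i=1}^n |a_i|^p\right)^{1/p}. \]

For the case $p = 2$, this, of course, reduces to 
the norm coming from the dot product. The proof that this is a norm in the general 
case relies on a generalization of Minkowski's inequality: $\|v + w\|_p \le \|v\|_p + \|w\|_p$ 
for all $v, w \in \mathbb{C}^n$ and any positive integer $p$. This norm can be used to define a norm 
on $\mathrm{Hom}(\mathbb{C}^n, \mathbb{C}^k)$ for positive integers $k$ and $n$, by setting 

\[ \|\alpha\|_p = \sup\left\{ \frac{\|\alpha(v)\|_p}{\|v\|_p} \;\middle|\; 0_V \ne v \in V \right\} \]

for any $\alpha \in \mathrm{Hom}(\mathbb{C}^n, \mathbb{C}^k)$, where $V = \mathbb{C}^n$. 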
With kind permission of The City University of New 
York (Bowker); With kind permission of UAL, FS N 191 
(Holder). 

General matrix norms were first discussed by the 
twentieth-century American mathematician Albert 
H. Bowker. The nineteenth-century German 
algebraist Otto Hölder was strongly influenced 
by the work of Kronecker. 


Example Let $n$ be a positive integer. If $A = [a_{ij}] \in M_{n \times n}(\mathbb{R})$, set 
$\|A\|_C = \max\bigl\{ \bigl|\sum_{i=1}^n \sum_{j=1}^n a_{ij} c_i c'_j\bigr| \;\bigm|\; c_i, c'_j \in \{0, 1\} \bigr\}$. This defines a norm on $M_{n \times n}(\mathbb{R})$, 
known as the cut norm. This norm has important applications in graph theory 
and combinatorics, but is hard to calculate. However, efficient methods of approximating 
the cut norm of a matrix exist, making use of the following remarkable 
result, known as Grothendieck's inequality: there exists a universal constant 
$k_G$ (not dependent on $n$) satisfying the condition that any normal vectors 
$u_1, \ldots, u_n, w_1, \ldots, w_n$ in $\mathbb{R}^n$ satisfy 

\[ \left|\sum_{i=1}^n \sum_{j=1}^n a_{ij}\,(u_i \cdot w_j)\right| \le k_G \max\left\{ \left|\sum_{i=1}^n \sum_{j=1}^n a_{ij}\,\varepsilon_i \varepsilon'_j\right| \;\middle|\; \varepsilon_1, \ldots, \varepsilon_n, \varepsilon'_1, \ldots, \varepsilon'_n \in \{-1, 1\} \right\}. \]

The precise value of the constant 
$k_G$, known as Grothendieck's constant, has not been determined, but the French 
mathematician Jean-Louis Krivine has shown that $1.677\ldots \le k_G \le 1.782\ldots$. 
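To make the definition of the cut norm concrete, and to show why it is expensive, here is a brute-force sketch that tries every pair of 0/1 vectors. It performs on the order of $4^n$ evaluations, so it is only usable for very small $n$; the function name and the sample matrix are hypothetical, and this is not one of the efficient approximation methods mentioned above.

```python
from itertools import product

def cut_norm(A):
    """Cut norm by exhaustive search over all c, c' in {0,1}^n."""
    n = len(A)
    best = 0
    for c in product((0, 1), repeat=n):
        for c_prime in product((0, 1), repeat=n):
            total = sum(A[i][j] * c[i] * c_prime[j]
                        for i in range(n) for j in range(n))
            best = max(best, abs(total))
    return best

# Hypothetical 2 x 2 example: the maximum is attained by taking both rows
# and only the first column, giving |1 + 3| = 4.
A = [[1, -2],
     [3, -1]]
value = cut_norm(A)
```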


With kind permission of the Archives of the Mathematisches Forschungsinstitut Oberwolfach. 


The French algebraic geometer Alexandre Grothendieck is considered 
one of the most influential of contemporary mathematicians. 


Example For positive integers $k$ and $n$, we define the Frobenius norm or Hilbert-Schmidt 
norm of $A = [a_{ij}] \in M_{k \times n}(\mathbb{C})$ by 

\[ \|A\|_F = \sqrt{\mathrm{tr}(AA^H)} = \sqrt{\sum_{i=1}^k \sum_{j=1}^n |a_{ij}|^2}. \]

This is precisely the norm coming from the inner product on $M_{k \times n}(\mathbb{C})$ given by 
$\langle A, B\rangle = \mathrm{tr}(AB^H)$. If $A \in M_{k \times n}(\mathbb{C})$ has spectral norm $\|A\|$ and Frobenius norm 
$\|A\|_F$, then it is straightforward to show that $\|A\| \le \|A\|_F \le \sqrt{n}\,\|A\|$. 


For vector spaces $V$ finitely generated over $\mathbb{R}$ or $\mathbb{C}$, it does not matter which 
norm one chooses. To see this, we need the following preliminary result. 
Proposition 15.5 Let $\{v_1, \ldots, v_n\}$ be a finite linearly-independent subset of 
a normed space $V$. Then there exists a positive real number $c$ such that 
$\bigl\|\sum_{i=1}^n a_i v_i\bigr\| \ge c\bigl(\sum_{i=1}^n |a_i|\bigr)$ for all scalars $a_1, \ldots, a_n$. 


Proof Let $W$ be the subspace of $V$ generated by $\{v_1, \ldots, v_n\}$ and let $Y$ be the subset 
of $W$ consisting of all linear combinations $\sum_{i=1}^n a_i v_i$ for which $\sum_{i=1}^n |a_i| = 1$. Pick 
$w = \sum_{i=1}^n a_i v_i \in W$. If $w = 0_V$, then $a_i = 0$ for each $i$ and so $\sum_{i=1}^n |a_i| = 0$. Therefore, 
any positive real number $c$ will do. Hence we can assume that $w \ne 0_V$ and so 
$d = \sum_{i=1}^n |a_i| > 0$. Moreover, $y = \sum_{i=1}^n (a_i d^{-1})v_i \in Y$ and $\|w\| \ge c\bigl(\sum_{i=1}^n |a_i|\bigr)$ if 
and only if $\|y\| \ge c$. Therefore, to prove the proposition it suffices to show that there 
exists a positive real number $c$ satisfying the condition that $\|y\| \ge c$ for all $y \in Y$. 

Suppose that this is not the case. Then we can find a sequence $y_1, y_2, \ldots$ of vectors 
in $Y$ such that $y_h = \sum_{i=1}^n b_{ih} v_i$ with $\sum_{i=1}^n |b_{ih}| = 1$ and $\lim_{h \to \infty} \|y_h\| = 0$. In 
particular, we note that $|b_{ih}| \le 1$ for each $1 \le i \le n$ and each $h \ge 1$. Thus, in particular, 
the sequence $b_{11}, b_{12}, \ldots$ of scalars is bounded. By the Bolzano-Weierstrass 
Theorem (which holds for both real and complex numbers), this sequence must 
therefore have a convergent subsequence. Throwing away all of the $y_h$ for which $b_{1h}$ is 
not in that subsequence, we can assume without loss of generality that the sequence 
$b_{11}, b_{12}, \ldots$ converges to some scalar $b_1$. Similarly, the sequence $b_{21}, b_{22}, \ldots$ has a 
convergent subsequence and, throwing away all of the $y_h$ for which $b_{2h}$ is not in that 
subsequence, we can assume that the sequence $b_{21}, b_{22}, \ldots$ converges to some scalar $b_2$ 
as well. Continuing in this manner, we finally obtain an infinite sequence $y_1, y_2, \ldots$ 
of vectors in $Y$ such that, for each $1 \le i \le n$, the sequence of scalars $b_{i1}, b_{i2}, \ldots$ 
converges to some scalar $b_i$. 

Set $y = \sum_{i=1}^n b_i v_i$. Since $\sum_{i=1}^n |b_i| = \lim_{h \to \infty} \sum_{i=1}^n |b_{ih}| = 1$, we have $y \in Y$ and so not all of 
the $b_i$ are equal to 0. In particular, by linear independence, $y \ne 0_V$ and so $\|y\| = r > 0$. 
On the other hand, for each $h \ge 1$ we have 
$\|y\| \le \|y - y_h\| + \|y_h\| = \bigl\|\sum_{i=1}^n (b_i - b_{ih})v_i\bigr\| + \|y_h\| \le \bigl(\sum_{i=1}^n |b_i - b_{ih}| \cdot \|v_i\|\bigr) + \|y_h\|$. 
But $\lim_{h \to \infty} \|y_h\| = 0$ and $\lim_{h \to \infty} |b_i - b_{ih}| = 0$ for each $1 \le i \le n$, and so 
there exists an integer $h$ so large that $\|y\| < r$. This is a contradiction, from which 
the result follows. 


Norms $\|\cdot\|_a$ and $\|\cdot\|_b$ defined on the same vector space $V$ are equivalent 
if and only if there exist positive real numbers $c$ and $d$ such that $c\|v\|_a \le \|v\|_b \le d\|v\|_a$ 
for all $v \in V$. 


Proposition 15.6 Any two norms defined on a finitely-generated vector space 
V over R or C are equivalent. 


Proof Let $\{v_1, \ldots, v_n\}$ be a basis for a vector space $V$ over $\mathbb{R}$ or $\mathbb{C}$ on which we 
have norms $\|\cdot\|_a$ and $\|\cdot\|_b$ defined. By Proposition 15.5, there exists a positive real number $c$ such 
that $\bigl\|\sum_{i=1}^n a_i v_i\bigr\|_b \ge c\bigl(\sum_{i=1}^n |a_i|\bigr)$ for any vector $v = \sum_{i=1}^n a_i v_i$ in $V$. On the other 
hand, from the triangle inequality, we have $\|v\|_a \le \sum_{i=1}^n |a_i| \cdot \|v_i\|_a \le r\bigl(\sum_{i=1}^n |a_i|\bigr)$, 
where $r = \max\{\|v_1\|_a, \ldots, \|v_n\|_a\} > 0$, and so $(cr^{-1})\|v\|_a \le \|v\|_b$ for each $v \in V$. 


Interchanging the roles of $\|\cdot\|_a$ and $\|\cdot\|_b$, we repeat this proof to obtain a positive 
real number $d$ such that $\|v\|_b \le d\|v\|_a$ for each $v \in V$. 


Proposition 15.7 (Hahn-Banach Theorem) Let $V$ be a vector space over a 
field $F$ which is either $\mathbb{R}$ or $\mathbb{C}$ and let $v \mapsto \|v\|$ be a norm defined on $V$. Moreover, 
let $W$ be a subspace of $V$ and let $\delta \in D(W)$ satisfy the condition that 
$|\delta(w)| \le \|w\|$ for all $w \in W$. Then there exists a linear functional $\theta \in D(V)$ 
which is an extension of $\delta$ satisfying $|\theta(v)| \le \|v\|$ for all $v \in V$. 


Proof (1) We first consider the case $F = \mathbb{R}$. Let $\mathcal{C}$ be the set of all pairs $(Y, \psi)$, 
where $Y$ is a subspace of $V$ containing $W$ and $\psi \in D(Y)$ satisfies the conditions 
that $\psi(y) \le \|y\|$ for all $y \in Y$ and $\psi$ is an extension of $\delta$. This set is nonempty since 
$|\delta(w)| \le \|w\|$ surely implies that $\delta(w) \le \|w\|$ and so $(W, \delta) \in \mathcal{C}$. Moreover, we can 
define a partial order on $\mathcal{C}$ by setting $(Y, \psi) \preceq (Y', \psi')$ if and only if $Y \subseteq Y'$ and 
$\psi'(y) = \psi(y)$ for all $y \in Y$. If $\{(Y_h, \psi_h) \mid h \in \Omega\}$ is a chain in $\mathcal{C}$, set $Y = \bigcup_{h \in \Omega} Y_h$ 
and define $\psi \in D(Y)$ by setting $\psi(y) = \psi_h(y)$ when $y \in Y_h$. This function is well-defined 
since we are dealing with a chain, and $(Y, \psi)$ surely belongs to $\mathcal{C}$. Moreover, it is clear that 
$(Y_h, \psi_h) \preceq (Y, \psi)$ for each $h \in \Omega$. Therefore, by the Hausdorff Maximum Principle, 
$\mathcal{C}$ has a maximal element, which we will denote by $(Y_0, \theta)$. 

We want to show that $Y_0 = V$. Indeed, assume that this is not the case and let 
$z \in V \setminus Y_0$. Then $Y_1 = Y_0 + \mathbb{R}z$ properly contains $Y_0$ and, for any $c_0 \in \mathbb{R}$, we can 
define the linear functional $\theta_1 \in D(Y_1)$ by $\theta_1 : y_0 + az \mapsto \theta(y_0) + ac_0$, which 
surely is an extension of $\theta$. We will be done if we can pick $c_0$ in such a manner 
that $\theta_1(y_1) \le \|y_1\|$ for each $y_1 \in Y_1$, for if we can do that, then we would have 
$(Y_1, \theta_1) \in \mathcal{C}$, contradicting the maximality of $(Y_0, \theta)$. If $y_1, y_2 \in Y_0$ then 

\[ \theta(y_1) - \theta(y_2) = \theta(y_1 - y_2) \le \|y_1 - y_2\| = \|y_1 + z - z - y_2\| \le \|y_1 + z\| + \|-z - y_2\|. \]

This implies that $-\|-z - y_2\| - \theta(y_2) \le \|y_1 + z\| - \theta(y_1)$. Since $y_2$ does not appear 
on the right side of this inequality nor does $y_1$ appear on the left side, we see that the 
real numbers 

\[ d_1 = \inf\{\|y_1 + z\| - \theta(y_1) \mid y_1 \in Y_0\} \]
and 
\[ d_2 = \sup\{-\|-z - y_2\| - \theta(y_2) \mid y_2 \in Y_0\} \]

satisfy $d_2 \le d_1$. Now choose $c_0$ to be any real number satisfying $d_2 \le c_0 \le d_1$. 
We claim that $\theta(y_0) + ac_0 \le \|y_0 + az\|$ for all $y_0 \in Y_0$ and all real numbers $a$. If $a = 0$, we know 
it is true by the choice of $\theta$. If $a > 0$ we have $c_0 \le d_1 \le \|a^{-1}y_0 + z\| - \theta(a^{-1}y_0)$ 
and so $ac_0 \le a\|a^{-1}y_0 + z\| - \theta(y_0) = \|y_0 + az\| - \theta(y_0)$, whence $\theta(y_0) + ac_0 \le \|y_0 + az\|$. 
If $a < 0$ we have $c_0 \ge d_2 \ge -\|-z - a^{-1}y_0\| - \theta(a^{-1}y_0)$ and so, multiplying by $-a > 0$, 
$-ac_0 \ge a\|-z - a^{-1}y_0\| + \theta(y_0) = -\|az + y_0\| + \theta(y_0)$. Thus $-\theta(y_0) - ac_0 \ge -\|az + y_0\|$, 
whence $\theta(y_0) + ac_0 \le \|y_0 + az\|$. 

Thus we see that $\theta \in D(V)$ satisfies $\theta(v) \le \|v\|$ for all $v \in V$. If $v \in V$ then 
$-\theta(v) = \theta(-v) \le \|-v\| = |-1| \cdot \|v\| = \|v\|$ as well, and so $|\theta(v)| \le \|v\|$ for all 
$v \in V$, proving our result in the case the field of scalars is $\mathbb{R}$. 

(2) Now assume that $F = \mathbb{C}$. Since $W$ and $V$ are vector spaces over $\mathbb{C}$, they are 
also vector spaces over $\mathbb{R}$. Write $\delta$ as $\delta : w \mapsto \delta_1(w) + i\delta_2(w)$, where $\delta_1, \delta_2 \in D(W)$, 
considering $W$ as a vector space over $\mathbb{R}$. Moreover, $\delta_1(w) \le |\delta(w)|$ for all $w \in W$, 
since $\mathrm{Re}(z) \le |z|$ for any $z \in \mathbb{C}$. Therefore, $\delta_1(w) \le \|w\|$ for all $w \in W$ and, as in 
the last part of the proof of part (1), we actually have $|\delta_1(w)| \le \|w\|$ for all $w \in W$. 
By part (1), we then know that there exists a linear functional $\theta_1 \in D(V)$, where $V$ is 
considered as a vector space over $\mathbb{R}$, which extends $\delta_1$ and satisfies 
$\theta_1(v) \le \|v\|$ for all $v \in V$. 

But $i[\delta_1(w) + i\delta_2(w)] = i\delta(w) = \delta(iw) = \delta_1(iw) + i\delta_2(iw)$ for all $w \in W$. Since 
the real parts of both sides must be equal, we see that $\delta_2(w) = -\delta_1(iw)$. Now define 
the function $\theta : V \to \mathbb{C}$ by setting $\theta : v \mapsto \theta_1(v) - i\theta_1(iv)$. This is a linear functional 
on $V$, considered as a vector space over $\mathbb{C}$, since clearly $\theta(v + v') = \theta(v) + \theta(v')$ 
for all $v, v' \in V$ and for each $a + bi \in \mathbb{C}$ and $v \in V$ we have 

\[ \theta((a + bi)v) = \theta_1(av + ibv) - i\theta_1(iav - bv) \]
\[ = a\theta_1(v) + b\theta_1(iv) - i[a\theta_1(iv) - b\theta_1(v)] \]
\[ = (a + bi)[\theta_1(v) - i\theta_1(iv)] = (a + bi)\theta(v). \]

Furthermore, $\theta$ is an extension of $\delta$. 
We claim that $|\theta(v)| \le \|v\|$ for all $v \in V$. To begin with, we note that if $\theta(v) = 0$ 
this holds, since $\|v\| \ge 0$ for all $v \in V$. Now assume that $\theta(v) \ne 0$. Then there exists 
a real number $r$ such that $\theta(v) = |\theta(v)|e^{ir}$ and so $|\theta(v)| = \theta(v)e^{-ir}$. Since 
$|\theta(v)|$ is real, this means that $\theta(v)e^{-ir} = \theta(e^{-ir}v) \in \mathbb{R}$ and so $|\theta(v)| = \theta(e^{-ir}v) = \theta_1(e^{-ir}v) \le 
\|e^{-ir}v\| = |e^{-ir}| \cdot \|v\| = \|v\|$. Thus the proposition is proven. 


Proposition 15.8 Let $V$ be a normed space and let $W$ be a nontrivial subspace 
of $V$ on which we are given a linear functional $\delta$, for which $\|\delta\|$ is finite. Then 
there exists a linear functional $\theta \in D(V)$ which is an extension of $\delta$ satisfying 
$\|\theta\| = \|\delta\|$. 


Proof For each $w \in W$ we have $|\delta(w)| \le \|\delta\| \cdot \|w\|$. Moreover, we have a norm 
$v \mapsto \|v\|^*$ on $V$ by setting $\|v\|^* = \|\delta\| \cdot \|v\|$ for all $v \in V$. Therefore, by Proposition 
15.7, we know that there exists a linear functional $\theta \in D(V)$ extending $\delta$ and 
satisfying $|\theta(v)| \le \|v\|^* = \|\delta\| \cdot \|v\|$, and so $\frac{|\theta(v)|}{\|v\|} \le \|\delta\|$ for all $0_V \ne v \in V$. Thus 
$\|\theta\| \le \|\delta\|$. On the other hand, 

\[ \|\delta\| = \sup\left\{ \frac{|\delta(w)|}{\|w\|} \;\middle|\; 0_W \ne w \in W \right\} \le \sup\left\{ \frac{|\theta(v)|}{\|v\|} \;\middle|\; 0_V \ne v \in V \right\} = \|\theta\|, \]

and so we have the desired equality. 


The norm $\|\cdot\|_1$ defined on $\mathbb{C}^n$ is important in various contexts. Let $n$ be a positive 
integer and let $\theta$ be the function from $M_{n \times n}(\mathbb{C})$ to $\mathbb{R}$ defined by $\theta : [a_{ij}] \mapsto 
\max\bigl\{\sum_{i=1}^n |a_{ij}| \bigm| 1 \le j \le n\bigr\}$, which we have already seen when we defined condition 
numbers. Numerical algorithms that compute the eigenvalues of a matrix, as 
a rule, make roundoff errors on the order of $c\,\theta(A)$, where $c$ is a constant determined 
by the precision of the computer on which the algorithm is running. Since 
the eigenvalues of similar matrices are identical, it is usually useful, given a square 
matrix $A$, to find a matrix $B$ similar to $A$ with $\theta(B)$ small. This can often be 
done by choosing $B$ of the form $PAP^{-1}$, where $P$ is a nonsingular diagonal matrix. 


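Example If $A = \begin{bmatrix} 1 & 0 & 10^{-3} \\ 1 & 1 & 10^{-2} \\ 10^{3} & 10^{2} & 1 \end{bmatrix}$, then $\theta(A) = 1002$. However, if we choose 
$P = \begin{bmatrix} 10 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 10^{-2} \end{bmatrix}$, then $\theta(PAP^{-1}) = 3$. 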
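The effect of diagonal scaling on $\theta$ can be reproduced with exact rational arithmetic. The sketch below is an illustration under an assumption: it reads the example's matrix as having entries $10^{-3}$, $10^{-2}$, $10^{3}$, $10^{2}$ off the diagonal pattern and $P = \mathrm{diag}(10, 1, 10^{-2})$, a reading chosen because it reproduces the stated values 1002 and 3. The function names are hypothetical.

```python
from fractions import Fraction as Fr

def theta(A):
    """Maximum absolute column sum of a square matrix."""
    n = len(A)
    return max(sum(abs(A[i][j]) for i in range(n)) for j in range(n))

def diagonal_similarity(A, p):
    """Return P A P^{-1} for P = diag(p): entry (i, j) becomes p[i] * a_ij / p[j]."""
    n = len(A)
    return [[p[i] * A[i][j] / p[j] for j in range(n)] for i in range(n)]

# Assumed reading of the example's matrix and scaling (see the lead-in).
A = [[Fr(1),    Fr(0),   Fr(1, 1000)],
     [Fr(1),    Fr(1),   Fr(1, 100)],
     [Fr(1000), Fr(100), Fr(1)]]
p = [Fr(10), Fr(1), Fr(1, 100)]

B = diagonal_similarity(A, p)
before = theta(A)
after = theta(B)
```

Because $B$ is similar to $A$, the two matrices have the same eigenvalues, but the bound $\theta$ drops from `before` to `after`.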


Let $\alpha$ be the endomorphism of $\mathbb{C}^n$ represented with respect to the canonical basis 
by a matrix $A \in M_{n \times n}(\mathbb{C})$. Then for each $v \in \mathbb{C}^n$ we have $\theta(A)\|v\|_1 \ge \|\alpha(v)\|_1$. 
In particular, if $c$ is an eigenvalue of $\alpha$ associated with an eigenvector $v$ then 
$\theta(A)\|v\|_1 \ge \|\alpha(v)\|_1 = |c| \cdot \|v\|_1$ and so $\theta(A) \ge |c|$. Thus we see that $\theta(A) \ge \rho(A)$, 
where $\rho(A)$ is the spectral radius of $A$. This bound is called the Gershgorin bound. 
In fact, we can sharpen this result. 


With kind permission of the Archives of the Mathematisches 
Forschungsinstitut Oberwolfach (Taussky-Todd). 

Semyon Aranovich Gershgorin was a twentieth 
century Russian mathematician. Gershgorin's theorem 
was published in a Russian journal in 1931 
and was generally ignored, until it was noticed and 
publicized by the Austrian-born American mathematician 
Olga Taussky-Todd, one of the most important 
researchers in matrix theory, who worked on the development of numerical linear 
algebra methods for computers after World War II. 
Proposition 15.9 (Gershgorin's Theorem) Let $\alpha$ be the endomorphism of 
$\mathbb{C}^n$ represented with respect to the canonical basis by the matrix $A = [a_{ij}] \in 
M_{n \times n}(\mathbb{C})$ and, for each $1 \le i \le n$, let $r_i = \sum_{j \ne i} |a_{ij}|$. Let $K_i$ be the circle 
in the complex plane with radius $r_i$ and center $a_{ii}$. Then $\mathrm{spec}(\alpha) \subseteq K = 
\bigcup_{i=1}^n K_i$. 


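Proof Let $c$ be an eigenvalue of $\alpha$ and let $v = \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix}$ be an eigenvector of $\alpha$ associated 
with $c$. Let $h$ be an index satisfying $|b_h| \ge |b_i|$ for all $1 \le i \le n$. Then 
$b_h \ne 0$ and $Av = cv$, so $(c - a_{hh})b_h = \sum_{j \ne h} a_{hj}b_j$ and hence 
$|c - a_{hh}| \cdot |b_h| \le \sum_{j \ne h} |a_{hj}b_j| \le |b_h|\,r_h$. Thus $|c - a_{hh}| \le r_h$ and so $c \in K_h \subseteq K$, as desired. 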
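Gershgorin's Theorem is easy to check on a small case. The sketch below uses a hypothetical $2 \times 2$ complex matrix so that the eigenvalues can be obtained from the quadratic formula with the standard library alone; each region $K_i$ is treated as the closed disk $|z - a_{ii}| \le r_i$, as in the proof.

```python
import cmath

def gershgorin_disks(A):
    """Return the list of (center a_ii, radius r_i), with r_i = sum_{j != i} |a_ij|."""
    n = len(A)
    return [(A[i][i], sum(abs(A[i][j]) for j in range(n) if j != i))
            for i in range(n)]

def eigenvalues_2x2(A):
    """Eigenvalues of a 2 x 2 matrix via its characteristic polynomial."""
    trace = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    root = cmath.sqrt(trace * trace - 4 * det)
    return ((trace + root) / 2, (trace - root) / 2)

def in_some_disk(z, disks, tol=1e-9):
    """True if z lies in at least one closed disk (up to a small tolerance)."""
    return any(abs(z - center) <= radius + tol for center, radius in disks)

# Hypothetical example matrix.
A = [[4 + 1j, 1],
     [2, -3]]
disks = gershgorin_disks(A)
spectrum = eigenvalues_2x2(A)
all_covered = all(in_some_disk(z, disks) for z in spectrum)
```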


Proposition 15.10 (Diagonal Dominance Theorem¹) Let $n$ be a positive 
integer and let $A = [a_{ij}] \in M_{n \times n}(\mathbb{C})$ satisfy the condition that $|a_{ii}| > 
\sum_{j \ne i} |a_{ij}|$ for all $1 \le i \le n$. Then $A$ is nonsingular. 


Proof The stated condition says that 0 does not belong to any of the circles $K_i$ 
defined in Gershgorin's Theorem and so it cannot be an eigenvalue of $A$. Hence $A$ 
is nonsingular. 


Example Let a be the endomorphism of C* represented with respect to the 


3 12 0 
canonical basis by the matrix A = 7 a : = . Then spec(A) = 
0 0 3 5 


$\{15.32, 4.49, 1.59 \pm 2.35i\}$. These numbers are found in the union $K$ of the following 
circles in the complex plane: the circle of radius 3 around the point $(3, 0)$, 
the circle of radius 6 around the point $(15, 0)$, the circle of radius 4 around the point 
$(0, 0)$, and the circle of radius 3 around the point $(5, 0)$. We furthermore note that 
$\mathrm{spec}(A) = \mathrm{spec}(A^T)$ and so, by the same argument, we see that the eigenvalues of $\alpha$ 
lie in the union $K'$ of the following circles in the complex plane: the circle of radius 
7 around the point $(3, 0)$, the circle of radius 1 around the point $(15, 0)$, the circle 
of radius 5 around the point $(0, 0)$, and the circle of radius 3 around the point $(5, 0)$. 


¹This theorem was proven by the French mathematicians L. Lévy and J. Desplanques at the end 
of the nineteenth century. It was independently rediscovered by several other algebraists, including 
Hadamard, Minkowski, and Nekrasov. 




These circles and the location of the eigenvalues can be seen in the figure above. 
Thus, the eigenvalues of $\alpha$ lie in $K \cap K'$. 


Since any polynomial in C[X] is the characteristic polynomial of a matrix, we 
can use Gershgorin’s Theorem to get a bound on the location of the zeros of any 
polynomial. However, there are more sophisticated methods available to get much 
better bounds. 

We will not go into the many results explicating Gershgorin’s Theorem. One of 
these, for example, states that if the union of s of the disks in the complex plane 
defined by Gershgorin circles forms a connected domain which is isolated from the 
disks defined by the remaining circles, then this domain contains precisely s of the 
eigenvalues of the given matrix. There are also many generalizations of Gershgorin’s 
Theorem, the best-known of which is the following. 


Proposition 15.11 (Brauer's Theorem) Let $\alpha$ be the endomorphism of $\mathbb{C}^n$ 
represented with respect to the canonical basis by the matrix $A = [a_{ij}] \in 
M_{n \times n}(\mathbb{C})$ and, for each $1 \le i \le n$, let $r_i = \sum_{j \ne i} |a_{ij}|$. For each $1 \le i \ne 
j \le n$, let $K_{ij}$ be the Cassini oval $\{z \in \mathbb{C} \mid |z - a_{ii}| \cdot |z - a_{jj}| \le r_i r_j\}$ in the 
complex plane. Then $\mathrm{spec}(\alpha) \subseteq K = \bigcup_{i \ne j} K_{ij}$. 


Proof Let $c$ be an eigenvalue of $\alpha$ and let $v = \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix}$ be an eigenvector of $\alpha$ associated 
with $c$. Let $h$ and $k$ be indices such that $|b_h| \ge |b_k| \ge |b_i|$ for all $i \ne h, k$. 
We know that $b_h \ne 0$, and we can assume, as well, that $b_k \ne 0$ for otherwise 
we would have $a_{hh} = c$, in which case surely $c \in K$. Since $Av = cv$, we have 
$(c - a_{hh})b_h = \sum_{j \ne h} a_{hj}b_j$, so 

\[ |c - a_{hh}| \cdot |b_h| = \Bigl|\sum_{j \ne h} a_{hj}b_j\Bigr| \le \sum_{j \ne h} |a_{hj}| \cdot |b_j| \le \sum_{j \ne h} |a_{hj}| \cdot |b_k| = r_h|b_k|. \]

In other words, $|c - a_{hh}| \le r_h|b_k| \cdot |b_h|^{-1}$. In the same manner, we obtain $|c - a_{kk}| \le 
r_k|b_h| \cdot |b_k|^{-1}$ and so, multiplying these two results together, we see that 
$|c - a_{hh}| \cdot |c - a_{kk}| \le r_h r_k$, so $c \in K_{hk} \subseteq K$, as desired. 


Note that Gershgorin's Theorem involves $n$ circles, whereas Brauer's Theorem 
involves $\binom{n}{2} = \frac{1}{2}n(n - 1)$ ovals. 


With kind permission of the Archives of the Mathematisches 
Forschungsinstitut Oberwolfach (Brauer). 

The twentieth-century German mathematician Alfred 
Brauer emigrated to the United States in 
1939; his research was primarily in matrix theory. 
Giovanni Domenico Cassini was a seventeenth-century 
Italian mathematician and astronomer. 


Example It is sometimes useful to consider norms on vector spaces V not over 
subfields of $\mathbb{C}$, namely functions $v \mapsto \|v\|$ from $V$ to $\mathbb{R}$ satisfying conditions (1)-(3) 
of Proposition 15.3. For example, let F be a finite field and let V = F” for some 
positive integer n. Define ||v|| to be the number of nonzero entries in v, for each 
v € V. This function is called the Hamming norm and is of extreme importance in 
algebraic coding theory, where one is interested in vector spaces over F in which 
every nonzero vector has a large Hamming norm. In an example at the beginning 
of Chap. 5, we showed a vector space of dimension 3 over GF(2), every nonzero 
element of which has Hamming norm equal to 4. 
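
A short Python sketch may make the last remark concrete. The generator rows below are an assumption of this sketch (they need not be the ones used in Chap. 5): their columns run through the seven nonzero vectors of $GF(2)^3$, which is one standard way of obtaining a three-dimensional space with the stated property.

```python
from itertools import product

def hamming_norm(v):
    """Number of nonzero entries of v."""
    return sum(1 for x in v if x != 0)

# Hypothetical generators of a 3-dimensional subspace of GF(2)^7.
GENERATORS = [
    (0, 0, 0, 1, 1, 1, 1),
    (0, 1, 1, 0, 0, 1, 1),
    (1, 0, 1, 0, 1, 0, 1),
]

def span_gf2(rows):
    """All GF(2)-linear combinations of the given rows."""
    length = len(rows[0])
    vectors = []
    for coeffs in product((0, 1), repeat=len(rows)):
        v = tuple(sum(c * r[i] for c, r in zip(coeffs, rows)) % 2
                  for i in range(length))
        vectors.append(v)
    return vectors

space = span_gf2(GENERATORS)
nonzero_norms = sorted({hamming_norm(v) for v in space if any(v)})
```

With these rows, `space` has eight distinct elements and `nonzero_norms` is `[4]`.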


With kind permission of the Special Collections & Archives, Dudley Knox Library, 
Naval Postgraduate School. 

Richard Hamming, a twentieth-century American mathematician 
and computer scientist, is best known for his development of the the- 
ory of error-detecting and error-correcting codes. 


If $v$ and $w$ are vectors in a space $V$ over which we have a norm defined, then
the distance between $v$ and $w$ is defined as $d(v, w) = \|v - w\|$. If $v \in V$ and
$\emptyset \ne U \subseteq V$, we define the distance of $v$ from $U$ by $d(v, U) = \inf\{d(v, u) \mid u \in U\}$.
When $V = \mathbb{R}^n$ on which we have the dot product, this just gives us the ordinary
notion of Euclidean distance. The ability to define the notion of distance in such
spaces is important, since it allows us to measure the degree of error in algorithmic
computations by measuring the distance between a computed value and the value
predicted by theory. It also allows us to define the notion of convergence.

The following proposition shows that this abstract notion of distance indeed has 
the geometric properties that one would expect from a notion of distance. 


Proposition 15.12 Let $V$ be a normed space and let $v, w, y \in V$. Then:
(1) $d(v, w) = d(w, v)$;

(2) $d(v, w) \ge 0$, where equality holds if and only if $v = w$;

(3) (Triangle inequality) $d(v, w) \le d(v, y) + d(y, w)$.


Proof This is an immediate consequence of Proposition 15.3. 


Example Let A be a finite set, and let V be the collection of all subsets of A, which 
is a vector space over F = GF(2). We have a norm defined on V by letting ||B|| be 
the number of elements in B. Then the distance between subsets B and C of A is 
||B + C||, namely the number of elements in their symmetric difference. 
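
As a small illustration of this example, the following Python sketch models subsets as `frozenset` values, with vector addition over $GF(2)$ realized as symmetric difference. The particular sets `B`, `C`, `Y` and the function names are chosen only for the sketch.

```python
def norm(B):
    """||B||: the number of elements of the subset B."""
    return len(B)

def add(B, C):
    """Addition in the GF(2)-space of subsets is symmetric difference."""
    return B ^ C

def distance(B, C):
    """d(B, C) = ||B - C||; over GF(2), B - C coincides with B + C."""
    return norm(add(B, C))

B = frozenset({1, 2, 3})
C = frozenset({3, 4})
Y = frozenset({2, 5})

d_BC = distance(B, C)   # symmetric difference is {1, 2, 4}
d_BY = distance(B, Y)   # symmetric difference is {1, 3, 5}
d_YC = distance(Y, C)   # symmetric difference is {2, 3, 4, 5}
```

The three values satisfy the triangle inequality of Proposition 15.12, since `d_BC` does not exceed `d_BY + d_YC`.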


If $A$ and $B$ are nonempty subsets of a space $V$ over which we have a norm
defined, then we set $d(A, B) = \inf\{d(v, w) \mid v \in A \text{ and } w \in B\}$. In particular, if
$v \in V$ and $B$ is a nonempty subset of $V$, we set $d(v, B) = d(\{v\}, B)$.

Let $n$ be a positive integer. If $A = [a_{ij}] \in M_{n\times n}(\mathbb{C})$, and if $k > 0$ is an
integer, let us define the matrix $P(k) = [p_{ij}^{(k)}]$ to be $I + \sum_{h=1}^{k} \frac{1}{h!} A^h$. We claim
that, for each fixed $1 \le i, j \le n$, the limit $\lim_{k\to\infty} p_{ij}^{(k)}$ exists in $\mathbb{C}$. Indeed, if
$B = [b_{ij}] \in M_{n\times n}(F)$, set $m(B) = \max_{1 \le i, j \le n} |b_{ij}|$. Then every entry in the
matrix $A^2$ equals the sum of $n$ products of pairs of entries of $A$ and so, in absolute
value, is equal to at most $m(A)^2 n$. Thus we see that $m(A^2) \le m(A)^2 n$. Similarly,
$m(A^3) \le m(A^2)m(A)n \le m(A)^3 n^2$ and so forth. Thus, in general,

$$m\Bigl(\frac{1}{k!}A^k\Bigr) \le \frac{n^{k-1}}{k!}\, m(A)^k \le \frac{1}{k!}\bigl[m(A)n\bigr]^k$$

and so, in particular, $m(P(k)) \le \sum_{h=0}^{k} \frac{1}{h!}[m(A)n]^h$ for all $k \ge 1$. But from calculus we
know that the series $\sum_{h=0}^{\infty} \frac{1}{h!} r^h$ converges absolutely to $e^r$ for each real number $r$.
Therefore, the limit we seek exists, and, at least by analogy, we are justified in
denoting the matrix $[\lim_{k\to\infty} p_{ij}^{(k)}]$ by $e^A$.
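
The convergence of the partial sums $P(k)$ can be observed numerically. The following pure-Python sketch is an illustration only; the test matrix and the helper names are assumptions of the sketch. It uses a matrix $A$ with $A^2 = -I$, for which the limit works out to $\cos(1) I + \sin(1) A$.

```python
import math

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][h] * Y[h][j] for h in range(n)) for j in range(n)]
            for i in range(n)]

def m(B):
    """m(B): the largest absolute value of an entry of B."""
    return max(abs(x) for row in B for x in row)

def partial_sum(A, k):
    """P(k) = I + sum_{h=1}^{k} A^h / h!."""
    n = len(A)
    P = identity(n)
    term = identity(n)                      # invariant: term == A^h / h!
    for h in range(1, k + 1):
        term = [[x / h for x in row] for row in mat_mul(term, A)]
        P = [[P[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return P

# Hypothetical test matrix: A^2 = -I, so e^A = cos(1) I + sin(1) A.
A = [[0.0, 1.0], [-1.0, 0.0]]
P10, P30 = partial_sum(A, 10), partial_sum(A, 30)
expected = [[math.cos(1), math.sin(1)], [-math.sin(1), math.cos(1)]]
error_10 = m([[P10[i][j] - expected[i][j] for j in range(2)] for i in range(2)])
error_30 = m([[P30[i][j] - expected[i][j] for j in range(2)] for i in range(2)])
```

The entrywise error `error_30` is at the level of floating-point roundoff, while `error_10` is still dominated by the first omitted term of the series.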
Matrix exponentials were explicitly studied by the American mathematician
William Henry Metzler at the end of the nineteenth century. They
appear earlier in the work of Laguerre and Peano.


Proposition 15.13 If $n$ is a positive integer and $A = [a_{ij}] \in M_{n\times n}(F)$ is a
diagonal matrix, where $F$ is $\mathbb{R}$ or $\mathbb{C}$, then $e^A = [b_{ij}]$ is a diagonal matrix with
$b_{ii} = e^{a_{ii}}$ for all $1 \le i \le n$.


Proof This is an immediate consequence of the definition. 


In particular, $e^O = I$. Moreover, this implies that if $B \in M_{n\times n}(F)$ is similar to
a diagonal matrix then $B$ and $e^B$ have the same eigenvectors, while the eigenvalues
of $e^B$ are the exponentials of the eigenvalues of $B$.


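Actually, we can do a bit better: if
$$A = \begin{bmatrix} A_1 & O & \cdots & O \\ O & A_2 & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & A_m \end{bmatrix},$$
where each $A_h$ is a square matrix, then
$$e^A = \begin{bmatrix} e^{A_1} & O & \cdots & O \\ O & e^{A_2} & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & e^{A_m} \end{bmatrix}.$$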


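Example If $O \ne A \in M_{n\times n}(F)$, where $F$ is either $\mathbb{R}$ or $\mathbb{C}$, is a nilpotent
matrix with index of nilpotence $k$, then $e^A = I + \sum_{h=1}^{k-1} \frac{1}{h!} A^h$. Thus, for example, if
$A = \begin{bmatrix} 0 & 1 & 2 \\ 0 & 0 & -1 \\ 0 & 0 & 0 \end{bmatrix}$, we have
$$e^A = I + A + \tfrac{1}{2}A^2 = \begin{bmatrix} 1 & 1 & \frac{3}{2} \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{bmatrix}.$$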

O k 


Example If A = 0 0 


eB | cos(r) sin(r) | 


| € M2x2(R), then e4 = E ipis- E a then 


—sin(r) cos(r) 
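Example If $A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \in M_{3\times 3}(\mathbb{R})$, then
$$e^A = \begin{bmatrix}
\frac{2}{3} + \frac{1}{3}e^3 & \frac{1}{3}e^3 - \frac{1}{3} & \frac{1}{3}e^3 - \frac{1}{3} \\
\frac{1}{3}e^3 - \frac{1}{3} & \frac{2}{3} + \frac{1}{3}e^3 & \frac{1}{3}e^3 - \frac{1}{3} \\
\frac{1}{3}e^3 - \frac{1}{3} & \frac{1}{3}e^3 - \frac{1}{3} & \frac{2}{3} + \frac{1}{3}e^3
\end{bmatrix}.$$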


If $P \in M_{n\times n}(F)$ is nonsingular, where $F$ is $\mathbb{R}$ or $\mathbb{C}$, then
$$P^{-1}\Bigl(I + \sum_{h=1}^{k} \frac{1}{h!} A^h\Bigr)P = I + \sum_{h=1}^{k} \frac{1}{h!} (P^{-1}AP)^h$$
for each $k$ and so $P^{-1}e^A P = e^{P^{-1}AP}$. Thus we see that the exponentials of similar
matrices are themselves similar. This is very important in calculations. In particular,
if $A$ is diagonalizable there exists a nonsingular matrix $P$ such that $P^{-1}AP$ is a
diagonal matrix $D = [d_{ij}]$ and so $P^{-1}e^A P = e^D$ is also a diagonal matrix. Thus $e^A$
is diagonalizable whenever $A$ is.

If $A, B$ is a commuting pair of matrices in $M_{n\times n}(F)$ then as a direct
consequence of the definition we see that $e^A e^B = e^{A+B} = e^B e^A$. But this is not true in
general, as the following example shows.


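Example If $A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$ then
$$e^A e^B = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} e & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} e & 1 \\ 0 & 1 \end{bmatrix} \ne \begin{bmatrix} e & e - 1 \\ 0 & 1 \end{bmatrix} = e^{A+B}.$$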


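Example The condition that $A, B$ be a commuting pair is sufficient for $e^A e^B = e^{A+B}$
to hold, but is not necessary. Thus, for example, if
$A = \begin{bmatrix} 0 & \pi \\ -\pi & 0 \end{bmatrix}$ and
$B = \begin{bmatrix} 0 & (7 + 4\sqrt{3})\pi \\ (-7 + 4\sqrt{3})\pi & 0 \end{bmatrix}$
then $AB \ne BA$, but $e^A = e^B = -I$ so $e^A e^B = I = e^{A+B}$.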


This fact is significant when it comes to calculating $e^A$ in many cases. For
example, suppose that $A$ is an $n \times n$ matrix having a single eigenvalue $c$ of
multiplicity $n$. Then for each scalar $t$, the matrices $ctI$ and $t(A - cI)$ commute and
so $e^{tA} = e^{ctI}e^{t(A - cI)} = (e^{ct}I)\sum_{k=0}^{\infty} \frac{t^k}{k!}(A - cI)^k$ and, from the Cayley-Hamilton
Theorem, we know that $(A - cI)^k = O$ for all $k \ge n$. Thus we see that
$e^{tA} = (e^{ct}I)\sum_{k=0}^{n-1} \frac{t^k}{k!}(A - cI)^k$ and so there exists a polynomial $p(X) \in F[X]$
satisfying $e^A = p(A)$.
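
As a quick worked instance of this finite sum, consider a $2 \times 2$ matrix with the single eigenvalue $c = 2$. The matrix is chosen here purely for illustration and does not come from the text; since $n = 2$, only the terms $k = 0$ and $k = 1$ survive.

```latex
\[
\begin{aligned}
A &= \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}, \qquad
A - 2I = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \qquad
(A - 2I)^2 = O, \\[4pt]
e^{tA} &= e^{2t}\bigl[I + t(A - 2I)\bigr]
        = e^{2t}\begin{bmatrix} 1 & t \\ 0 & 1 \end{bmatrix}.
\end{aligned}
\]
```

Setting $t = 1$ gives $e^A = e^2(A - I)$, so in this instance one may take $p(X) = e^2(X - 1)$.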
Thus we can put much of what we said together. Given a matrix $A \in M_{n\times n}(\mathbb{C})$,
we know by Proposition 13.7 that it is similar to a matrix in Jordan canonical form.
That is to say, there exists a nonsingular matrix $P$ such that $P^{-1}AP$ is of the form
$$\begin{bmatrix} A_1 & O & \cdots & O \\ O & A_2 & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & A_m \end{bmatrix},$$
where each $A_i$ is a square matrix of the form $c_i I + N_i$, for
$N_i$ a nilpotent matrix of a particularly simple form. Thus, for each $i$, we have
$e^{A_i} = e^{c_i}e^{N_i}$ and so
$$e^A = P\begin{bmatrix} e^{c_1}e^{N_1} & O & \cdots & O \\ O & e^{c_2}e^{N_2} & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & e^{c_m}e^{N_m} \end{bmatrix}P^{-1}.$$
Moreover, each $e^{N_i}$ is just $p_i(N_i)$ for some polynomial $p_i(X) \in \mathbb{C}[X]$.

We also note that any matrix $A$ commutes with $-A$, so $e^A e^{-A} = e^{A - A} = e^O = I$,
proving that $e^A$ is nonsingular and $e^{-A} = (e^A)^{-1}$. Therefore, we have a
function $A \mapsto e^A$ from $M_{n\times n}(F)$ (where $F$ is either $\mathbb{R}$ or $\mathbb{C}$) to the set of all
nonsingular matrices in $M_{n\times n}(F)$, which is not monic. In the case $F = \mathbb{C}$, this function
is in fact epic. If $A \in M_{n\times n}(F)$ then a matrix $B \in M_{n\times n}(F)$ is a matrix logarithm
of $A$ if and only if $A = e^B$. From the previous discussion, we see that only
nonsingular matrices have logarithms. If $F = \mathbb{C}$, then every nonsingular matrix has a
logarithm, but not necessarily a unique one.


Example If $A = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$ and $B = \begin{bmatrix} 0 & -2\pi \\ 2\pi & 0 \end{bmatrix}$ then $e^A = e^B = I$. Therefore, both
$A$ and $B$ are logarithms of $I$, which are not even similar.


A similar proof can be used to show that if $A$ has distinct eigenvalues $\{c_1, \ldots, c_n\}$
and if $p_k(X) = \prod_{j \ne k} (c_k - c_j)^{-1}(X - c_j)$ for all $1 \le k \le n$ then for any scalar $t$
we have $e^{tA} = \sum_{k=1}^{n} e^{tc_k} p_k(A)$.

What about, say, $\cos(A)$ and $\sin(A)$? We know that the cosine function has a
Maclaurin representation

$$\cos(x) = \sum_{i=0}^{\infty} \frac{(-1)^i}{(2i)!}\, x^{2i}.$$

For each natural number $n$, let us consider the polynomial

$$p_n(X) = \sum_{i=0}^{n} \frac{(-1)^i}{(2i)!}\, X^{2i}.$$

Then we can surely calculate $p_n(A)$ for each $n$ and see whether the sequence of
such matrices converges in some sense. However, there is another possibility. We
know that for any real or complex number $z$ we have $\cos(z) = \frac{1}{2}[e^{iz} + e^{-iz}]$ and so
we can just define $\cos(A)$ to be the matrix $\frac{1}{2}[e^{iA} + e^{-iA}]$, which we know always
exists.


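Example We see that $\cos(I) = \begin{bmatrix} \cos(1) & 0 & 0 \\ 0 & \cos(1) & 0 \\ 0 & 0 & \cos(1) \end{bmatrix}$ and
$$\cos\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} = \begin{bmatrix}
\frac{2}{3} + \frac{1}{3}\cos(3) & \frac{1}{3}\cos(3) - \frac{1}{3} & \frac{1}{3}\cos(3) - \frac{1}{3} \\
\frac{1}{3}\cos(3) - \frac{1}{3} & \frac{2}{3} + \frac{1}{3}\cos(3) & \frac{1}{3}\cos(3) - \frac{1}{3} \\
\frac{1}{3}\cos(3) - \frac{1}{3} & \frac{1}{3}\cos(3) - \frac{1}{3} & \frac{2}{3} + \frac{1}{3}\cos(3)
\end{bmatrix}.$$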


Similarly, we know that $\sin(z) = \frac{1}{2i}[e^{iz} - e^{-iz}]$ and so we can define $\sin(A)$ to
be $\frac{1}{2i}[e^{iA} - e^{-iA}]$.
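
The definition of $\cos(A)$ through exponentials is easy to try out numerically. The following Python sketch is illustrative only: the truncation length `terms=60` and all function names are assumptions of the sketch. It evaluates $\frac{1}{2}[e^{iA} + e^{-iA}]$ for the $3 \times 3$ matrix of ones and compares two entries with the closed form in terms of $\cos(3)$.

```python
import math

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][h] * Y[h][j] for h in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, terms=60):
    """Truncated exponential series I + sum_{h=1}^{terms} A^h / h! (complex entries)."""
    n = len(A)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]          # invariant: term == A^h / h!
    for h in range(1, terms + 1):
        term = [[x / h for x in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def mat_cos(A):
    """cos(A) = (e^{iA} + e^{-iA}) / 2."""
    n = len(A)
    plus = mat_exp([[1j * x for x in row] for row in A])
    minus = mat_exp([[-1j * x for x in row] for row in A])
    return [[(plus[i][j] + minus[i][j]) / 2 for j in range(n)] for i in range(n)]

ones = [[1, 1, 1] for _ in range(3)]
C = mat_cos(ones)
diagonal_expected = 2 / 3 + math.cos(3) / 3
off_diagonal_expected = math.cos(3) / 3 - 1 / 3
```

The computed entries are complex numbers whose imaginary parts vanish up to roundoff, as one expects for a real matrix.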


Exercises 


Exercise 934 
Let V = C(—1,1) and let a > —5 be a real number. Is the function m : 


V x V —> R defined by (f, g) = JLE — t°]! f (t)g(t)dt an inner product 
on V? 


Exercise 935 
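Is the function $\mu : \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}$ defined by

$$\mu : \left(\begin{bmatrix} a_1 \\ a_2 \end{bmatrix}, \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}\right) \mapsto a_1(b_1 + b_2) + a_2(b_1 + 2b_2)$$

an inner product on $\mathbb{R}^2$?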


Exercise 936 
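Is the function $\mu : \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}$ defined by

$$\mu : \left(\begin{bmatrix} a_1 \\ a_2 \end{bmatrix}, \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}\right) \mapsto a_1 b_1 - a_1 b_2 - a_2 b_1 + 4a_2 b_2$$

an inner product on $\mathbb{R}^2$?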
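Exercise 937
Is the function $\mu : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}$ defined by

$$\mu : \left(\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}, \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}\right) \mapsto a_1 b_1 + 2a_2 b_2 + 3a_3 b_3 + a_1 b_2 + a_2 b_1$$

an inner product on $\mathbb{R}^3$?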


Exercise 938 
Verify whether the function $\mu : \mathbb{R}[X] \times \mathbb{R}[X] \to \mathbb{R}$ defined by $\mu : (f, g) \mapsto
\deg(fg)$ is an inner product on $\mathbb{R}[X]$.


Exercise 939 
Give an example of a function u : R? x R? —> R which satisfies the first two 
conditions of an inner product, which does not satisfy the third, but does satisfy 


«([o} Lo) =" 


Is the function  : R[X] x R[X] — R defined by 


oo 00 œo oœ 1 
H: (Sax. $nx) > Dea 
i=0 j=0 bard oe 


i=0 j=0 
an inner product on R[X]? 


Exercise 941 

Let V be the vector space of all continuous functions from R to itself. Let 
u: V x V > R be the function given by n : (f, g) œ> limysoo tf, St (s)g(s) ds. 
Is u an inner product? 


Exercise 942 

Let V be a vector space over C and let u : V x V > C be a function satisfying 
the following conditions: 

(1) For each w € V, the function v +> u(v, w) from V to C is a linear functional; 
(2) If v, w € V then u (v, w) = u (w, v); 

(3) If v € V satisfies u(v, w) = 0 for all w € V, then v = Oy. 

Is u an inner product on V? 
Exercise 943
Let $V$ be the vector space of all continuously differentiable functions from the
interval $[a, b]$ in $\mathbb{R}$ to $\mathbb{R}$. If $f, g \in V$, define the Sobolev inner product

$$\langle f, g\rangle = \int_a^b f(t)g(t)\,dt + \int_a^b f'(t)g'(t)\,dt,$$

where $f'$ and $g'$ are the derivatives of $f$ and $g$, respectively. Verify that this is
indeed an inner product on $V$.


With kind permission of the Archives of the Mathematisches Forschungsinstitut Ober- 
wolfach. 


Sergei Lvovich Sobolev was a twentieth-century Russian mathematician 
who worked primarily in functional analysis. 


Exercise 944 
Let $\{u, v\}$ be a linearly-dependent subset of an inner product space $V$. Show that
$\|u\|^2 v = \langle v, u\rangle u$.


Exercise 945 


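Let $n$ be a positive integer and let $v = \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} \in \mathbb{R}^n$. Is the function
$\mu_v : \mathbb{R}[X_1, \ldots, X_n] \times \mathbb{R}[X_1, \ldots, X_n] \to \mathbb{R}$ defined by

$$\mu_v : (p, q) \mapsto p(c_1, \ldots, c_n)\, q(c_1, \ldots, c_n)$$

an inner product on $\mathbb{R}[X_1, \ldots, X_n]$?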


Exercise 946 

Let $a < b$ be real numbers and let $V = C(a, b)$. Let $h_0 \in V$ be a function
satisfying the condition that $h_0(t) > 0$ for all $a < t < b$. Show that the function
$\mu : V \times V \to \mathbb{R}$ defined by $\mu : (f, g) \mapsto \int_a^b f(x)g(x)h_0(x)\,dx$ is an inner
product on $V$.


Exercise 947 
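Let $c$ and $d$ be given real numbers. Find a necessary and sufficient condition that
the function $\mu : \left(\begin{bmatrix} a_1 \\ a_2 \end{bmatrix}, \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}\right) \mapsto c a_1 b_1 + d a_2 b_2$ be an inner product on $\mathbb{R}^2$.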
Exercise 948
Is the function u : R? x R? —> R defined by 


al bı 
u:| | a |,| bo | | = ab +ba + (a3b3)? 
a3 b3 


an inner product on R?? 


Exercise 949 
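Let $V$ be an inner product space over $\mathbb{R}$ and let $n > 1$ be an integer. For positive
real numbers $a_1, \ldots, a_n$, define the function $\mu : V^n \times V^n \to \mathbb{R}$ by

$$\mu : \left(\begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix}, \begin{bmatrix} w_1 \\ \vdots \\ w_n \end{bmatrix}\right) \mapsto \sum_{i=1}^{n} a_i \langle v_i, w_i\rangle.$$

Is $\mu$ an inner product on $V^n$?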


Exercise 950 

Let n be a positive integer and let V be the subspace of R[X] consisting of all 
polynomials of degree at most n. Is the function u : V x V —> R defined by 
u: (p,q) = Yio PCG) an inner product on V? 


Exercise 951 
Let 0 <n € Z. Is the function u : C” x C” > C defined by 


ai bı P 
H: TOE > J aibri 
i=l 


an bn 


an inner product? 


Exercise 952 

Let V = C? on which we have defined the dot product, and let D = {v € V | 
; 1 

|u|] = 1}. Find {(Av, v) | v € D}, where A = p o € Max2(C). 

Exercise 953 

Let $n$ be a positive integer and let $\{v_1, \ldots, v_k\}$ be a set of vectors in $\mathbb{R}^n$ satisfying
$v_i \cdot v_j \le 0$ for all $1 \le i < j \le k$. Show that $k \le 2n$ and give an example in which
equality holds.


Exercise 954 
Letn be a positive integer and let A € Mnxn (C) be idempotent. Is A” necessarily 
idempotent? 
Exercise 955
Let $V$ be an inner product space and let $v \ne v'$ be vectors in $V$. Show that there
exists a vector $w \in V$ satisfying $\langle v, w\rangle \ne \langle v', w\rangle$.


Exercise 956 

Let $V$ be an inner product space finitely generated over its field of scalars, and
let $B = \{v_1, \ldots, v_n\}$ be a basis of $V$. Show that there exists a basis $\{w_1, \ldots, w_n\}$
of $V$ satisfying the condition that

$$\langle v_i, w_j\rangle = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{otherwise.} \end{cases}$$

Exercise 957 
Let $W$ be a subspace of a vector space $V$ over $\mathbb{R}$ and let $Y$ be a complement of
$W$ in $V$. Define an inner product $\mu$ on $W$ and an inner product $\nu$ on $Y$. Is the
function from $V \times V \to \mathbb{R}$ defined by $(w + y, w' + y') \mapsto \mu(w, w') + \nu(y, y')$
an inner product on $V$?


Exercise 958 
Let V=C(0, 1). Let A = { fi, ..., fn} be a linearly-independent subset of V and 
define a function u : R x R > R by u : (a, b) > ae fj (a) cos/(b). Show that 


if h € V and if there exists a function g € V such that h(x) = Ts u(x, y)g(y) dy 
for all x € R, then h € RA. 


Exercise 959 

Let V be an inner product space over R. For each real number a, set 
U(a) = {v € V | (v, v) < a}. Given a real number a, find a real number b such 
that (v + w, v + w) € U (b) forall v, w € U (a). 


Exercise 960 
Let $V$ be an inner product space and let $\alpha \in \operatorname{End}(V)$. Show that $\langle \alpha(v), v\rangle\langle v, \alpha(v)\rangle
\le \|\alpha(v)\|^2$ for every normal vector $v \in V$.


Exercise 961 
For real numbers a1, ..., an, show that 


n n n 
$04([Eue)( Han) 
i=l i=l 


i=1 


Exercise 962 
(Binet-Cauchy identity) For $u, v, w, y \in \mathbb{R}^3$, show that $(v \times w) \cdot (y \times u) =
(v \cdot y)(w \cdot u) - (v \cdot u)(y \cdot w)$.
Exercise 963
Let $v$ be a normal vector in $\mathbb{R}^3$. Show that the function $\alpha_v : \mathbb{R}^3 \to \mathbb{R}^3$ defined by
$\alpha_v : w \mapsto v \times (v \times w) + w$ is a projection in $\operatorname{End}(\mathbb{R}^3)$.


Exercise 964 
Let $V$ be an inner product space and let $\alpha \in \operatorname{End}(V)$ be a projection. Does it
necessarily follow that $\|\alpha(v)\| \le \|v\|$ for all $v \in V$?


Exercise 965 


For $u, v, w, y \in \mathbb{R}^3$, show that $(u \times v) \cdot (w \times y) = \begin{vmatrix} u \cdot w & u \cdot y \\ v \cdot w & v \cdot y \end{vmatrix}$.


Exercise 966 


For nonnegative real numbers $a$, $b$, and $c$, show that

$$(a + b + c)\sqrt{2} \le \sqrt{a^2 + b^2} + \sqrt{b^2 + c^2} + \sqrt{a^2 + c^2}.$$


Exercise 967 
For real numbers $0 < a < b < c$, show that

$$\sqrt{b^2 + c^2} \le (\sqrt{2})a + \sqrt{(b - a)^2 + (c - a)^2}.$$


Exercise 968 
Let n be a positive integer and let A € Mnxn(R) be a matrix the n Gershgorin 
circles of which are mutually disjoint. Prove that all of the eigenvalues of A are 


real. 


Exercise 969 
Show that $\bigl[\int_0^1 f(x)\,dx\bigr]^2 \le \int_0^1 f(x)^2\,dx$ for any $f \in C(0, 1)$.


Exercise 970 
Let f : R —>R be the constant function x +> 1. Calculate || f || when f is consid- 
ered as an element of C(O, 5) and compare it to || f ||, when f is considered as 


an element of C (0, 7). 


Exercise 971 
Let $V$ be an inner product space over $\mathbb{R}$ and let $v, w \in V$ satisfy $\|v + w\| =
\|v\| + \|w\|$. Show that $\|av + bw\| = a\|v\| + b\|w\|$ for all $0 \le a, b \in \mathbb{R}$.
Exercise 972
(Real polarization identity) Let $V$ be an inner product space over $\mathbb{R}$. Show that
$\langle u, v\rangle = \frac{1}{4}\bigl(\|u + v\|^2 - \|u - v\|^2\bigr)$ for all $u, v \in V$.


Exercise 973 
(Complex polarization identity) Let $V$ be an inner product space over $\mathbb{C}$. Show
that $\langle u, v\rangle = \frac{1}{4}\bigl(\|u + v\|^2 - \|u - v\|^2 + i\|u + iv\|^2 - i\|u - iv\|^2\bigr)$ for all $u, v \in V$.


Exercise 974 

Let V = C(0,1) on which we have defined the inner product (f, g) = 
ie f@)g(t) dt. Let W be the subspace of V generated by the function x > x”. 
Find all elements of W normal with respect to this inner product. 


Exercise 975 
Let $V$ be an inner product space over $\mathbb{R}$ and assume that $v, w \in V$ are nonzero
vectors satisfying the condition $\langle v, w\rangle = \|v\| \cdot \|w\|$. Show that $\mathbb{R}v = \mathbb{R}w$.


Exercise 976 

Let $V$ be a vector space over $\mathbb{R}$ on which we have two inner products, $\mu$ and $\mu'$,
defined, which in turn define distance functions $d$ and $d'$ respectively. If $d = d'$,
does it necessarily follow that $\mu = \mu'$?


Exercise 977 
(Apollonius’ identity) Let $V$ be an inner product space. Show that

$$\|u - w\|^2 + \|v - w\|^2 = \frac{1}{2}\|u - v\|^2 + 2\Bigl\|\frac{1}{2}(u + v) - w\Bigr\|^2$$

for all $u, v, w \in V$.


The Greek geometer Apollonius of Perga, who worked in Alexandria
in the third century BC, in his famous book Conics, was the first to
introduce the terms “hyperbola”, “parabola”, and “ellipse”.


Exercise 978 

Let $n$ be a positive integer and let $\|\cdot\|$ be a norm defined on $\mathbb{C}^n$. For each
$A \in M_{n\times n}(\mathbb{C})$, let $\|A\|$ be the spectral norm of $A$. If $A \in M_{n\times n}(\mathbb{C})$ is
nonsingular, show that every singular matrix $B \in M_{n\times n}(\mathbb{C})$ satisfies $\|A - B\| \ge
\|A^{-1}\|^{-1}$. Does there necessarily exist a singular matrix $B$ for which equality
holds?
Exercise 979
Let V be an inner product space over R and consider the function 6: V x V x 
V = R defined by 


Ov, w, y) = lv +w + yl? + llv +w + yl? lv- w -— yl? — lv- w + yl. 


Show that, for any v, w, y € V, the value of 0 (v, w, y) does not depend on y. 


Exercise 980 
Let V be an inner product space over R and let n > 2. Let 9: V” —> V be the 
vi 
function defined by 0: | : | => 1 1 Vi. Show that 
Un 
2 2 
n Ul n Ul 
: 2 
Yifu-el| : = So lvl? -n 
i=l Un i=l Un 


Exercise 981 
Let $V$ be an inner product space over $\mathbb{R}$ and let $v$ and $w$ be nonzero vectors
in $V$. Show that $|\langle v, w\rangle|^2 = \langle v, v\rangle\langle w, w\rangle$ if and only if the set $\{v, w\}$ is linearly
dependent.


Exercise 982 

Let V be a finitely-generated inner product space and let B = {v1, ..., Vn} be a 
set of vectors in V. Show that B is linearly dependent if and only if its Gram 
matrix is singular. 


Exercise 983 
Let $V$ be a vector space over $\mathbb{R}$ and let $\|\cdot\|$ be a norm defined on $V$. Show that
$\bigl|\|v\| - \|w\|\bigr| \le \|v - w\|$ for all $v, w \in V$.


Exercise 984 

Let $V$ be an inner product space finitely generated over $\mathbb{R}$ and let $\delta \in D(V)$. Pick
$v_0 \in V$. Show that for each real number $e > 0$ there exists a real number $d > 0$
such that $|\delta(v) - \delta(v_0)| < e$ whenever $\|v - v_0\| < d$.


Exercise 985 
Let V be an inner product space. For any u, v, w € V, show that 


0 1 1 1 
1 0 d(u, v}? d(u, w) 

1 dtu, vy? o: dwuw” 
1 dtu,w)* d(v, w)? 0 
Exercise 986
Let n > 1. Show that there is no norm v b> |\v|| defined on C” satisfying 


Allg =sup( 4! | Oy # v eC") for all A € Mnxn(C). 


Exercise 987 
Let $n > 1$ be an integer. For each $A \in M_{n\times n}(\mathbb{C})$, let $\|A\| = \rho(A)$, the spectral
radius of $A$. Does this turn $M_{n\times n}(\mathbb{C})$ into a normed space?


Exercise 988 

Let $V$ be the vector space of all continuous functions from the unit interval $[0, 1]$
on the real line to $\mathbb{R}$ and for each $f \in V$, set $\|f\| = \int_0^1 |f(t)|\,dt$. Is $\|\cdot\|$ a norm
on $V$?


Exercise 989 
Let p > 2 be prime and let n be a positive integer. For each 1 <i < n, 
define w(i) = min{i — 1, p — i + 1}. Does the function GF(p)” — R defined 


ay 

by | : | J; w(i)ai turn GF(p)” into a normed space? 
an 

Exercise 990 


Let 0 < p < 1 and let f : R” — R be the function defined by 


a n 1/p 
fi) 2 1 (Zar) . 
i=l 


an 


Show that this is not a norm but does satisfy the inequality f(v + w) < 
20-P)/PL f(v) + f(w)] for all v, w € R”. 


Exercise 991 


Let Vj,..., Vn be normed spaces over the same field and let V = Te | Vi. For 

each 1 <i < n, denote the norm defined on V; by ||- ||; and define a function 
vı 

ve livi] from V to R by setting ; = } ;_ llvilli. Is this a norm on V? 
Vn 

Exercise 992 


Let V be a normed space and let Oy 4 vo € V. Show that there exists a linear 
functional 0 € D(V) satisfying ||@|| = 1 and 0 (vo) = || voll. 


Exercise 993 
Let n > 1 and let A € Mnxn(C). Show that there are infinitely-many other ma- 
trices in Mpnxn(C) having the same Gershgorin circles as A. 




Exercise 994 

Let A = [aij] € M2 2(C) and let K be the Cassini oval defined by A. Show that 
every point on the boundary of K is an eigenvalue of a matrix B € M2x2(C) 
defining the same Cassini oval. 


Exercise 995 
Let n > 1 be an integer and let A = [aij] € Mnxn(R). For any e > 0, show that 
there exists a nonsingular matrix B € Mnxn(R) satisfying || A — Bl|z < e. 


Exercise 996 
Let $n$ be a positive integer and let $A = [a_{ij}] \in M_{n\times n}(\mathbb{C})$. Let $f : \mathbb{R} \to M_{n\times n}(\mathbb{C})$
be defined by $f : t \mapsto e^{tA}$. Show that the derivative of $f$ is given by $f' : t \mapsto Ae^{tA}$.


Exercise 997 
Let F be a field and let n be a positive integer. Let a € End(F”) and let V be a 


subspace of F disjoint from ker(a). If || - || denotes the Hamming norm on F”, 
is it necessarily true that ||v|| = ||a(v)|| for all v € V? 
Exercise 998 


Let n be a positive integer and let œ be an endomorphism of C” represented 
with respect to the canonical basis by a matrix A € Mpnxn(C). Then the canon- 
ical inner product on C” defines norms on C” and on End(C”). Show that 
p(A) < |la*||!/* for any integer k > 0. 


Exercise 999 
Let V and W be normed spaces over R or C and let a: V > W be a linear 
transformation for which ||æ|| exists. Show that D(a) satisfies || D(a)|| = |la||. 


Exercise 1000 

Let V be a vector space finitely-generated over a field F and let L be the set of 
all subspaces of V. For W, Y € L, define d(W, Y) = dim(W + Y) — dim(W N Y). 
Does this function satisfy the conditions of Proposition 15.12? 


Exercise 1001 
1 


1 
For each real number ft, set A(t)=| 1 0 1 | € M3x3(R). Does there exist a 
1 1 ż 
value of t for which || A(t) ||¥ = ||A(@|l2? 


Exercise 1002 


t 1 0 0 
1 ¢ 0 0 : 1 

For each ¢ > 0, let f(t) = 001z . Calculate lim;—> oo 7 f (t). 
0 0O ft 1 


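Exercise 1003
Let $V$ be a vector space over $\mathbb{R}$ and let $Y$ be its complexification. Define a
function $\mu : Y \times Y \to \mathbb{C}$ by

$$\mu : \left(\begin{bmatrix} v_1 \\ v_2 \end{bmatrix}, \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}\right) \mapsto \langle v_1, w_1\rangle + \langle v_2, w_2\rangle + i\bigl[\langle v_2, w_1\rangle - \langle v_1, w_2\rangle\bigr].$$

Show that $\mu$ is an inner product on $Y$ and calculate $\left\|\begin{bmatrix} v_1 \\ v_2 \end{bmatrix}\right\|$ for each $\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} \in Y$.


Exercise 1004
Let $V$ be a normed space and let $\alpha \in \operatorname{End}(V)$ have an induced norm satisfying
$\|\alpha\| < 1$. Show that $\sigma_1 - \alpha \in \operatorname{Aut}(V)$.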


Exercise 1005 

Let V be the set of all “infinite matrices” A = [a;;], where a;; € R for alli, j > 0, 
which is a vector space over R with addition and scalar multiplication defined 
elementwise. Let p > | be a real number and let g be a real number satisfying 
; + ; = |. Let W be the subset of V consisting of all those matrices A satisfying 


the condition that 77° [092 |aij|41?/4 is finite. Show that W is a subspace of 
V and that the function At Os De |aj;|71?/7)'/? is a norm on W. 


16 Orthogonality


Let $V$ be an inner product space and let $0_V \ne v, w \in V$. From Proposition 15.2 we
see that

$$-1 \le \frac{\langle v, w\rangle + \langle w, v\rangle}{2\|v\| \cdot \|w\|} \le 1$$

and so there exists a real number $0 \le t \le \pi$ satisfying

$$\cos(t) = \frac{\langle v, w\rangle + \langle w, v\rangle}{2\|v\| \cdot \|w\|}.$$

This number $t$ is the angle between $v$ and $w$. Note that if we are working over $\mathbb{R}$,
then

$$\cos(t) = \frac{\langle v, w\rangle}{\|v\| \cdot \|w\|}.$$


Example If $V = \mathbb{R}^n$ is endowed with the dot product, and if $0_V \ne v, w \in V$ then,
using analytic geometry, it is easy to show that the angle as defined here is indeed
the angle between the straight line determined by $v$ and the origin, and the straight
line determined by $w$ and the origin. If we define different inner products on $V$, we
build in this manner various non-Euclidean geometries in $n$-space.


Example Let $V = C(0, 1)$, on which we have defined the inner product $\langle f, g\rangle =
\int_0^1 f(x)g(x)\,dx$. In particular, consider the functions $f : x \mapsto 5x^2$ and $g : x \mapsto 3x$.
Then $\|f\| = \sqrt{5}$ and $\|g\| = \sqrt{3}$, and the angle $t$ between $f$ and $g$ satisfies

$$\cos(t) = \frac{1}{\sqrt{15}} \int_0^1 (5x^2)(3x)\,dx = \frac{\sqrt{15}}{4}.$$
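
The numbers in this example can be confirmed with exact rational arithmetic. In the following Python sketch, polynomials are represented as coefficient lists indexed by degree; this representation and the function name `inner` are conventions of the sketch only.

```python
from fractions import Fraction
import math

def inner(f, g):
    """<f, g> = integral over [0, 1] of f(x) g(x) dx, for polynomial coefficient lists."""
    total = Fraction(0)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            # The integral of x^(i+j) over [0, 1] is 1/(i + j + 1).
            total += Fraction(a) * Fraction(b) / (i + j + 1)
    return total

f = [0, 0, 5]   # x -> 5x^2
g = [0, 3]      # x -> 3x

norm_f_squared = inner(f, f)
norm_g_squared = inner(g, g)
cos_t = float(inner(f, g)) / math.sqrt(norm_f_squared * norm_g_squared)
```

The squared norms are 5 and 3, the inner product of `f` and `g` is 15/4, and `cos_t` agrees with $\sqrt{15}/4$ up to roundoff.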


Vectors $v$ and $w$ in an inner product space $V$ are orthogonal if and only if
$\langle v, w\rangle = 0$. In this case, we write $v \perp w$. We note that if $v \perp w$ then $\|v + w\|^2 =
\|v\|^2 + \langle v, w\rangle + \langle w, v\rangle + \|w\|^2 = \|v\|^2 + \|w\|^2$. A nonempty subset $D$ of $V$ is a set of
mutually orthogonal vectors if $v \perp w$ whenever $v \ne w$ in $D$. If $\{v_1, \ldots, v_n\}$ is a
mutually orthogonal set of vectors in $V$ then one shows, similarly, that
$\bigl\|\sum_{i=1}^{n} v_i\bigr\|^2 = \sum_{i=1}^{n} \|v_i\|^2$.


Example We have already seen that if $v, w \in \mathbb{R}^3$, then $v \cdot (v \times w) = 0$. This says
that a vector $v$ is orthogonal to $v \times w$, for any vector $w$. The same is also true for $w$
and $v \times w$ and so we see that if $\{v, w\}$ is a linearly-independent subset of $\mathbb{R}^3$ then
the set $\{v, w, v \times w\}$ is linearly independent and so is a basis of $\mathbb{R}^3$.


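Moreover, as an immediate consequence of the Lagrange identity on $\mathbb{R}^3$, we see
that if $v \times w = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$ and $v \cdot w = 0$ then either $v = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$ or $w = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$. If $v, w \in \mathbb{R}^3$
are nonzero, then the angle $t$ between them satisfies the condition that $v \cdot w = (\|v\| \cdot \|w\|)\cos(t)$.
Using the Lagrange identity, we see that

$$\|v \times w\|^2 = \|v\|^2\|w\|^2 - (v \cdot w)^2 = \|v\|^2\|w\|^2\bigl[1 - \cos^2(t)\bigr] = \|v\|^2\|w\|^2\sin^2(t),$$

and so $\|v \times w\| = (\|v\| \cdot \|w\|)|\sin(t)|$. Thus $|\sin(t)| = \dfrac{\|v \times w\|}{\|v\| \cdot \|w\|}$.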


Example If $V = \mathbb{C}^2$ on which we have the dot product, then it is easy to see that
$\begin{bmatrix} 2 + 3i \\ -1 + 5i \end{bmatrix} \perp \begin{bmatrix} 1 + i \\ -i \end{bmatrix}$.


Example Let $V = C(-1, 1)$, on which we have defined the inner product $\langle f, g\rangle =
\int_{-1}^{1} f(x)g(x)\,dx$. For all $i \ge 0$, define the functions $p_i \in V$ as follows: $p_0 : x \mapsto 1$;
$p_1 : x \mapsto x$; and

$$p_{h+1} : x \mapsto \frac{2h + 1}{h + 1}\, x\, p_h(x) - \frac{h}{h + 1}\, p_{h-1}(x) \quad \text{whenever } h \ge 1.$$

These polynomial functions are known as Legendre polynomials. It is easy to verify
that $p_i \perp p_h$ whenever $i \ne h$.
On the same space, we can define another inner product, namely

$$\langle f, g\rangle = \int_{-1}^{1} \frac{f(x)g(x)}{\sqrt{1 - x^2}}\,dx.$$

For each $i \ge 0$, define the function $q_i \in V$ by setting $q_0 : x \mapsto 1$; $q_1 : x \mapsto x$; and
$q_{h+1} : x \mapsto 2x\,q_h(x) - q_{h-1}(x)$ whenever $h \ge 1$. These polynomial functions are
known as Chebyshev polynomials. It is again easy to verify that $q_i \perp q_h$ whenever
$i \ne h$.

Both of these inner products are special instances of a more general construction.
For any $-1 < r, s \in \mathbb{R}$, it is possible to define an inner product on $C(-1, 1)$ by
setting $\langle f, g\rangle = \int_{-1}^{1} f(x)g(x)(1 - x)^r(1 + x)^s\,dx$. The set of polynomial functions
which are mutually orthogonal with respect to this inner product is called the set of
Jacobi polynomials of type $(r, s)$. Such polynomials are important in many areas of
numerical analysis, and in particular in numerical integration.
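
The orthogonality of the Legendre polynomials under the first of these inner products can be verified exactly for small indices. The following Python sketch, in which the coefficient-list representation and all function names are assumptions of the sketch, generates $p_0, \ldots, p_5$ from the three-term recurrence and integrates the products over $[-1, 1]$ with rational arithmetic.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (index = degree)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def inner(p, q):
    """Exact integral of p(x) q(x) over [-1, 1]."""
    prod = poly_mul(p, q)
    # The integral of x^k over [-1, 1] is 2/(k + 1) for even k and 0 for odd k.
    return sum(c * Fraction(2, k + 1) for k, c in enumerate(prod) if k % 2 == 0)

def legendre(count):
    """p_0, ..., p_{count-1} via p_{h+1} = ((2h+1)/(h+1)) x p_h - (h/(h+1)) p_{h-1}."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for h in range(1, count - 1):
        x_ph = [Fraction(0)] + polys[h]                     # coefficients of x * p_h
        prev = polys[h - 1] + [Fraction(0)] * (len(x_ph) - len(polys[h - 1]))
        polys.append([Fraction(2 * h + 1, h + 1) * a - Fraction(h, h + 1) * b
                      for a, b in zip(x_ph, prev)])
    return polys[:count]

P = legendre(6)
off_diagonal = [inner(P[i], P[h]) for i in range(6) for h in range(6) if i != h]
```

Every entry of `off_diagonal` is exactly zero, and `P[2]` represents the polynomial $\frac{3}{2}x^2 - \frac{1}{2}$.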


With kind permission of the Bibliothèque de l’Institut de 
France (Legendre). 

Adrien-Marie Legendre was one of the first-rate 
mathematicians who worked in France during the 
time of the revolution and the generation after it. 
Among other things, he served on the committee 
that defined the metric system. Pafnuty Lvovich
Chebyshev, a nineteenth-century Russian mathematician,
made important contributions to both
pure and applied mathematics.


Proposition 16.1 Let $V$ be an inner product space over a field of scalars $F$.

(1) If $v \in V$ satisfies $v \perp w$ for all $w \in V$, then $v = 0_V$.

(2) If $\emptyset \ne A \subseteq V$ and if $v \in V$ satisfies the condition that $v \perp w$ for all
$w \in A$, then $v \perp w$ for all $w \in FA$.


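Proof (1) is an immediate consequence of the fact that if $v \ne 0_V$ then $\langle v, v\rangle \ne 0$.
Now assume that $\emptyset \ne A \subseteq V$ and that $v \perp w$ for all $w \in A$. If $y \in FA$ then there
exist elements $w_1, \ldots, w_n \in A$ and scalars $a_1, \ldots, a_n$ such that $y = \sum_{i=1}^{n} a_i w_i$ and
so $\langle v, y\rangle = \sum_{i=1}^{n} \overline{a_i}\langle v, w_i\rangle = 0$, whence $v \perp y$.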


Proposition 16.2 Let V be an inner product space and let A be a nonempty 
set of nonzero mutually-orthogonal vectors in V . Then A is linearly indepen- 
dent. 


Proof Let $\{v_1, \ldots, v_n\}$ be a finite subset of $A$ and assume that there exist scalars
$c_1, \ldots, c_n$ such that $\sum_{i=1}^{n} c_i v_i = 0_V$. Then, for $1 \le h \le n$, we have $c_h\langle v_h, v_h\rangle =
\sum_i c_i\langle v_i, v_h\rangle = \langle \sum_i c_i v_i, v_h\rangle = \langle 0_V, v_h\rangle = 0$ and hence $c_h = 0$. Thus any finite
subset of $A$ is linearly independent, and therefore $A$ is linearly independent.


If $V$ is an inner product space then any vector $0_V \ne w \in V$ defines a function

$$\pi_w : v \mapsto \frac{\langle v, w\rangle}{\langle w, w\rangle}\, w$$

from $V$ to itself, which is in fact a projection the image of which is the subspace
of $V$ generated by $\{w\}$. This easily-checked remark is the basis for the following
theorem.
Proposition 16.3 (Gram-Schmidt Theorem) Any finitely-generated inner
product space $V$ has a basis composed of mutually-orthogonal vectors.


Proof We will proceed by induction on $\dim(V)$. If $\dim(V) = 1$ the result is
immediate. Therefore, we can assume that the proposition is true for any inner
product space of dimension $k$, and assume that $\dim(V) = k + 1$. Let $W$ be a
subspace of $V$ of dimension $k$. By the induction hypothesis, there exists a basis
$\{v_1, \ldots, v_k\}$ of $W$ composed of mutually-orthogonal vectors. Let $v \in V \setminus W$ and
set $v_{k+1} = v - \sum_{i=1}^{k} \pi_{v_i}(v)$. This vector does not belong to $W$ since $v \notin W$.
Therefore, $\{v_1, \ldots, v_{k+1}\}$ is a generating set for $V$. Moreover, for $1 \le j \le k$, we have

$$\langle v_{k+1}, v_j\rangle = \langle v, v_j\rangle - \sum_{i=1}^{k} \frac{\langle v, v_i\rangle}{\langle v_i, v_i\rangle}\langle v_i, v_j\rangle = \langle v, v_j\rangle - \langle v, v_j\rangle = 0$$

and so $v_{k+1} \perp v_j$ for all $1 \le j \le k$. By Proposition 5.3, it follows that the set
$\{v_1, \ldots, v_{k+1}\}$ is linearly independent and so is a basis for $V$.


We should note that the proof of Proposition 16.3 is an algorithm, called the 
Gram-Schmidt process, which is easy to implement by a computer program to create 
a basis composed of mutually-orthogonal vectors of V, when we are given a basis 
of any sort for the space. 
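
A minimal Python sketch of this process, for the dot product on $\mathbb{R}^n$ and with exact rational arithmetic, might look as follows. The function names and the sample vectors are illustrative choices rather than part of the text, and the sketch assumes that its input is linearly independent.

```python
from fractions import Fraction

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def project(v, w):
    """pi_w(v) = (<v, w> / <w, w>) w for a nonzero vector w."""
    c = dot(v, w) / dot(w, w)
    return [c * x for x in w]

def gram_schmidt(basis):
    """Turn a linearly independent list into mutually orthogonal vectors."""
    ortho = []
    for v in basis:
        u = [Fraction(x) for x in v]
        for w in ortho:
            # As in the proof: subtract from v its projection on each earlier vector.
            p = project(v, w)
            u = [a - b for a, b in zip(u, p)]
        ortho.append(u)
    return ortho

# Hypothetical sample input in R^4.
U = gram_schmidt([[3, 0, 0, 0], [0, 1, 2, 1], [3, -1, 3, 2]])
```

For this input the third output vector is $\frac{1}{6}(0, -13, 4, 5)$, and all pairwise dot products of the output vectors vanish.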


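Example Let $v_1 = \begin{bmatrix} 3 \\ 0 \\ 0 \\ 0 \end{bmatrix}$, $v_2 = \begin{bmatrix} 0 \\ 1 \\ 2 \\ 1 \end{bmatrix}$, and $v_3 = \begin{bmatrix} 3 \\ -1 \\ 3 \\ 2 \end{bmatrix}$ be vectors in $\mathbb{R}^4$, on
which we have defined the dot product. The set $\{v_1, v_2, v_3\}$ is linearly independent
and so generates a three-dimensional subspace $W$ of $\mathbb{R}^4$. Let us use the Gram-Schmidt
process to build a basis for $W$ composed of mutually-orthogonal vectors.
Indeed, we define $u_1 = v_1$, $u_2 = v_2 - \pi_{u_1}(v_2) = \begin{bmatrix} 0 \\ 1 \\ 2 \\ 1 \end{bmatrix}$, and
$u_3 = v_3 - \pi_{u_1}(v_3) - \pi_{u_2}(v_3) = \frac{1}{6}\begin{bmatrix} 0 \\ -13 \\ 4 \\ 5 \end{bmatrix}$;
then $\{u_1, u_2, u_3\}$ is a basis for $W$, the vectors of which are
mutually orthogonal.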


Example Let $V = C(-1, 1)$, on which we have defined the inner product $\langle f, g\rangle =
\int_{-1}^{1} f(x)g(x)\,dx$. For all $i \ge 0$, let $f_i$ be the polynomial function $f_i : x \mapsto x^i$. Then
for each $n > 0$, the set $\{f_0, \ldots, f_n\}$ is linearly independent and so forms a basis for a
subspace $W$ of $V$. We now apply the Gram-Schmidt process to this basis, to obtain
a basis $\{p_0, \ldots, p_n\}$ of vectors in $V$ which are mutually orthogonal, where the $p_i$
are precisely the Legendre polynomials we introduced earlier.


Example If a is a diagonalizable endomorphism of a finitely-generated inner prod- 
uct space V then there exists a basis B composed of eigenvectors of œ. Applying 
the Gram-Schmidt process to B will yield a basis of V composed of mutually- 
orthogonal vectors, but they may no longer be eigenvectors of a. Thus, for exam- 


è m2: n a a+b : _ 1 1 _ 
ple if v= Rita: | | 2b faaite lfa 1 , then the Gram. 


0 


Schmidt process yields r Aa | | , where H is not an eigenvector of œ. 


Actually, the assumption that we have a basis in hand when initiating the
Gram-Schmidt process is one of convenience rather than necessity. We could begin with
an arbitrary generating set $\{v_1, \ldots, v_n\}$ for the given space. In that case, at the $h$th
stage of the process we would begin by checking whether $v_h$ is a linear combination
of the set of mutually-orthogonal vectors $\{u_1, \ldots, u_{h-1}\}$ we have already created. If
it is, we just discard it and go on to $v_{h+1}$.

We should point out that the Gram-Schmidt process is not considered computa- 
tionally stable—small errors and roundoffs in the computational process accumulate 
rapidly and can lead at the end to a significant difference between the true solution 
and the computed solution. There are, fortunately, other more sophisticated meth- 
ods of constructing a basis composed of mutually-orthogonal vectors from a given 
basis. 


Proposition 16.4 (Hadamard inequality) Let $n$ be a positive integer, let
$A = [a_{ij}] \in M_{n\times n}(\mathbb{R})$ be a nonsingular matrix, and let $e = |A|$. Then
$|e| \le g^n\sqrt{n^n}$, where $g = \max\{|a_{ij}| \mid 1 \le i, j \le n\}$.


Proof Denote the rows of $A$ by $v_1, \dots, v_n$. Then $\{v_1, \dots, v_n\}$ is a basis for $V = \mathbb{R}^n$
and so, using the Gram-Schmidt method, we can find a new basis $\{u_1, \dots, u_n\}$
for $V$, on which we consider the dot product, composed of mutually-orthogonal
vectors, and defined by setting $u_1 = v_1$ and $u_h = v_h - \sum_{j=1}^{h-1} c_{hj}u_j$, where
$c_{hj} = (v_h \cdot u_j)(u_j \cdot u_j)^{-1}$. If $B \in M_{n\times n}(\mathbb{R})$ is the matrix the rows of which are
$u_1, \dots, u_n$, then $A = CB$, where

$$C = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ c_{21} & 1 & 0 & \cdots & 0 \\ c_{31} & c_{32} & 1 & \cdots & 0 \\ \vdots & \vdots & & \ddots & \vdots \\ c_{n1} & c_{n2} & \cdots & c_{n,n-1} & 1 \end{bmatrix}.$$

Since $C$ is a lower-triangular matrix, its determinant is the product of the entries
on its diagonal, namely 1. Therefore $e = |B|$.

By looking at the Gram-Schmidt method, we see that $\|u_i\| \le \|v_i\|$ for all
$1 \le i \le n$. Moreover, since the $u_i$ are mutually orthogonal, we see that $BB^T = D$,
where $D = [d_{ij}]$ is the diagonal matrix defined by $d_{ii} = \|u_i\|^2$ for all $1 \le i \le n$.
Therefore, $e^2 = |BB^T| = |D| = \prod_{i=1}^{n} \|u_i\|^2$. Now let $g = \max\{|a_{ij}| \mid 1 \le i, j \le n\}$.
Then $\|u_i\| \le \|v_i\| \le g\sqrt{n}$ for all $1 \le i \le n$ and so $|e| \le g^n\sqrt{n^n}$, as desired.
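A small numerical illustration of the Hadamard inequality may be helpful. The Python sketch below computes a determinant by exact Gaussian elimination and compares it with the bound $g^n\sqrt{n^n}$; the function names `det` and `hadamard_bound` and the two sample matrices are assumptions chosen for this illustration. The first matrix has entries $\pm 1$ and mutually orthogonal rows, a case in which the bound is attained.

```python
from fractions import Fraction
import math

def det(rows):
    """Determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    n, sign, result = len(m), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:                       # a row swap changes the sign
            m[c], m[pivot] = m[pivot], m[c]
            sign = -sign
        result *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            m[r] = [a - f * b for a, b in zip(m[r], m[c])]
    return sign * result

def hadamard_bound(rows):
    """The bound g^n * sqrt(n^n), where g is the largest absolute value of an entry."""
    n = len(rows)
    g = max(abs(x) for row in rows for x in row)
    return g ** n * math.sqrt(n ** n)

H4 = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]
M2 = [[2, 1], [1, 3]]
for A in (H4, M2):
    print(abs(det(A)), "<=", hadamard_bound(A))
```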


Let $V$ be an inner product space having a subspace $W$. Let
$W^\perp = \{v \in V \mid \langle v, w\rangle = 0 \text{ for all } w \in W\}$. By Proposition 16.1, we know that $W^\perp$
is a subspace of $V$. Since $\langle v, v\rangle \ne 0$ for all $0_V \ne v \in V$, it is clear that $W$ and $W^\perp$
are disjoint. Also, again by Proposition 16.1, we see that $V^\perp = \{0_V\}$ and
$\{0_V\}^\perp = V$. The space $W^\perp$ is called the orthogonal complement of $W$ in $V$, and
this name is justified by the following result:


Proposition 16.5 Let $W$ be a subspace of a finitely-generated inner product
space $V$. Then $V = W \oplus W^\perp$ and $W = (W^\perp)^\perp$. Moreover, if $Y$ is another
subspace of $V$ then $W^\perp \cap Y^\perp = (W + Y)^\perp$.


Proof By Proposition 16.3, we know that it is possible to find a basis $\{v_1, \dots, v_k\}$
of $W$ which is composed of mutually-orthogonal vectors, and by the construction
method used in the proof of this proposition, we see that this can be extended to a
basis $\{v_1, \dots, v_n\}$ of $V$, the elements of which are still mutually orthogonal. Thus
$v_i \in W^\perp$ for all $k < i \le n$, proving that $V = W + W^\perp$. But we already know that $W$
and $W^\perp$ are disjoint and so we have $V = W \oplus W^\perp$. Moreover, $\{v_{k+1}, \dots, v_n\}$ is a basis
for $W^\perp$ and so $W = (W^\perp)^\perp$.

Now let $Y$ be another subspace of $V$. If $v \in W^\perp \cap Y^\perp$ then, for each $w \in W$
and $y \in Y$ we have $\langle v, w + y\rangle = \langle v, w\rangle + \langle v, y\rangle = 0$ and so $v \in (W + Y)^\perp$.
Conversely, if $v \in (W + Y)^\perp$ then $\langle v, w\rangle = \langle v, y\rangle = 0$ for all $w \in W$ and $y \in Y$,
so $v \in W^\perp \cap Y^\perp$.


In particular, if $V$ is an inner product space having a subspace $W$ then we have a
natural projection of $W \oplus W^\perp$ onto $W$, called the orthogonal projection. The image
of a vector $v \in W \oplus W^\perp$ under this projection is the unique element of $W$ closest to
$v$, according to the distance function defined by the inner product on $V$, in the sense
of the following proposition.


Proposition 16.6 Let $W$ be a subspace of an inner product space $V$ and let
$v = w + y$, where $w \in W$ and $y \in W^\perp$. Then $\|v - w'\| \ge \|v - w\|$ for all
$w' \in W$, with equality holding if and only if $w' = w$.
Proof If $w' \in W$ then

$$\begin{aligned}
\|v - w'\|^2 &= \|w - w' + y\|^2 = \langle w - w' + y,\, w - w' + y\rangle \\
&= \langle w - w', w - w'\rangle + \langle y, w - w'\rangle + \langle w - w', y\rangle + \langle y, y\rangle \\
&= \langle w - w', w - w'\rangle + \langle y, y\rangle \\
&= \|w - w'\|^2 + \|y\|^2 = \|w - w'\|^2 + \|v - w\|^2,
\end{aligned}$$

and from here the result follows immediately.
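To make Proposition 16.6 concrete, the following Python sketch computes the orthogonal projection of a vector of $\mathbb{R}^3$ onto a plane $W$ and compares distances. It assumes the dot product on $\mathbb{R}^3$ and a basis of $W$ that is already mutually orthogonal; the names `project` and `dist` and the sample data are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project(v, orthogonal_basis):
    """Orthogonal projection of v onto the span of mutually orthogonal vectors."""
    w = [0.0] * len(v)
    for u in orthogonal_basis:
        c = dot(v, u) / dot(u, u)
        w = [a + c * b for a, b in zip(w, u)]
    return w

def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

basis_W = [[1.0, 1.0, 0.0], [1.0, -1.0, 0.0]]   # orthogonal basis of the plane z = 0
v = [3.0, 4.0, 5.0]
w = project(v, basis_W)                          # the component of v lying in W
others = [[2.0, 4.0, 0.0], [3.0, 5.0, 0.0], [0.0, 0.0, 0.0]]   # other elements of W
print(w, dist(v, w), [dist(v, x) for x in others])
```

Every other element of $W$ tried here is strictly farther from $v$ than $w$ is, as the proposition predicts.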


One of the important problems in computational algebra is the following: Given
an endomorphism $\alpha$ of a finitely-generated inner product space $V$ and a vector
$0_V \ne v_0 \in V$, find an efficient procedure to define an orthogonal projection onto the
Krylov subspace $F[\alpha]v_0$. One of the first of these is the Arnoldi process, a
modification of the Gram-Schmidt process. Several variants of this procedure have been
devised, depending on special properties of $\alpha$. This process is not considered as
computationally efficient as the Lanczos algorithm mentioned earlier. Arnoldi's
process is also the basis for the GMRES algorithm (GMRES = generalized minimal
residual) for solution of systems of linear equations, devised by Yousef Saad and
Martin Schultz in 1986.
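The following is a minimal Python sketch of an Arnoldi-type iteration, given only as an illustration and not as a production implementation. It assumes a real matrix `A` representing $\alpha$ with respect to an orthonormal basis and uses the dot product; the function names `arnoldi` and `project`, the dictionary `H` of orthogonalization coefficients, and the breakdown tolerance `1e-12` are all choices made for this sketch.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    return [dot(row, x) for row in A]

def arnoldi(A, v0, m):
    """Orthonormal vectors spanning span{v0, A v0, ..., A^(m-1) v0} (real case)."""
    norm0 = math.sqrt(dot(v0, v0))
    Q = [[x / norm0 for x in v0]]
    H = {}
    for j in range(m - 1):
        w = matvec(A, Q[j])
        for i, q in enumerate(Q):            # Gram-Schmidt sweep against earlier vectors
            H[i, j] = dot(w, q)
            w = [a - H[i, j] * b for a, b in zip(w, q)]
        h = math.sqrt(dot(w, w))
        H[j + 1, j] = h
        if h < 1e-12:                        # the Krylov subspace is invariant; stop early
            break
        Q.append([x / h for x in w])
    return Q, H

def project(x, Q):
    """Orthogonal projection of x onto the span of the orthonormal vectors in Q."""
    p = [0.0] * len(x)
    for q in Q:
        c = dot(x, q)
        p = [a + c * b for a, b in zip(p, q)]
    return p

A = [[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]]
Q, H = arnoldi(A, [1.0, 0.0, 0.0], 2)
p = project([1.0, 2.0, 3.0], Q)
print(Q, p)
```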




Algerian/American Yousef Saad and American 
Martin H. Schultz are contemporary computer 
scientists. Walter Edward Arnoldi was a twenti- 
eth century American engineer whose career was 
mostly spent with United Aircraft Corporation. 


Note that Proposition 16.5 is not necessarily true if the space V is not finitely 
generated, as the following example shows. 


Example Let $V = \mathbb{R}^{(\infty)}$. For each $h \ge 0$, let $v_h$ be the sequence in which the $h$th
entry equals 1 and all other entries equal 0. Then $B = \{v_h \mid h \ge 0\}$ is a basis for $V$
composed of mutually-orthogonal vectors. Let $W = \mathbb{R}\{v_0 - v_1, v_1 - v_2, \dots\}$. This
subspace of $V$ is proper since $v_0 \in V \setminus W$. If $0_V \ne y \in W^\perp$ then there exists a
nonnegative integer $n$ such that $y = \sum_{i=0}^{n} a_iv_i$, where the $a_i$ are real numbers and
$a_n \ne 0$. But then $a_n = \langle y, v_n - v_{n+1}\rangle = 0$, and that is a contradiction. Therefore, we
have shown that $W^\perp = \{0_V\}$, despite the fact that $W \ne V$. Moreover, in this case
$V \ne W \oplus W^\perp$ and $(W^\perp)^\perp = V \ne W$.


Let V be an inner product space. A nonempty subset A of V is orthonormal if 
and only if the elements of A are mutually orthogonal, and each of them is normal. 
Thus, for example, the canonical basis of $\mathbb{R}^n$, equipped with the dot product, is
orthonormal.


Example Let $V = C(-\pi, \pi)$, on which we have an inner product defined by
$\langle f, g\rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)g(x)\,dx$. Then we have an orthonormal subset
$\left\{\frac{1}{\sqrt{2}}\right\} \cup \{\sin(nx) \mid n \ge 1\} \cup \{\cos(nx) \mid n \ge 1\}$ of $V$.


Example Let $V$ be the subspace of $\mathbb{R}^{\mathbb{R}}$ consisting of all functions $f$ for which
$\int_{-\infty}^{\infty} |f(x)|^2\,dx$ is finite, with the inner product
$\langle f, g\rangle = \int_{-\infty}^{\infty} f(x)g(x)\,dx$ defined on $V$. Let $h \in V$ be the function defined by

$$h : x \mapsto \begin{cases} 1 & \text{for } 0 < x \le \tfrac{1}{2}, \\ -1 & \text{for } \tfrac{1}{2} < x \le 1, \\ 0 & \text{otherwise.} \end{cases}$$

This function is known as the Haar wavelet. For each $j, k \in \mathbb{N}$ define the function
$h_k^j \in V$ by setting $h_k^j : x \mapsto 2^{j/2}h(2^jx - k)$. Then the subset $\{h_k^j \mid j, k \in \mathbb{N}\}$ of $V$ is
orthonormal. Haar wavelets have important applications in image compression.


The twentieth-century Hungarian mathematician Alfréd Haar worked
primarily in analysis.


Proposition 16.7 Every finitely-generated inner product space V has an or- 
thonormal basis. 


Proof By Proposition 16.3, we know that $V$ has a basis $\{v_1, \dots, v_n\}$ the elements
of which are mutually orthogonal. For each $1 \le i \le n$, let $w_i = \|v_i\|^{-1}v_i$. Then
each $w_i$ is normal and $\{w_1, \dots, w_n\}$ is a basis for $V$, the elements of which remain
mutually orthogonal.


We can modify the Gram-Schmidt method to provide an algorithm for construct- 
ing an orthonormal basis from any given basis of a finitely-generated inner product 
space V, by normalizing each basis element as it is created. This has the added 
advantage of tending to reduce accumulated roundoff and truncation errors. The ex- 
amples after Proposition 16.3 and Proposition 16.6 show that inner product spaces 
which are not finitely generated may have orthonormal bases as well, but this is 
not always true. Making use of the Hausdorff Maximum Principle, it is possible to 
show that every inner product space V has a maximal orthonormal set, which must 
be linearly independent by Proposition 16.2. Such a subset is called a Hilbert subset 
of V. Clearly, an orthonormal subset $A$ of $V$ is a Hilbert subset if and only if for every
$0_V \ne y \in V$ there exists a $v \in A$ satisfying $\langle v, y\rangle \ne 0$. If V is finitely generated then any Hilbert
subset of V is a basis for V, but this is not necessarily true for inner product spaces 
which are not finitely generated. 


Example Let $V$ be the set of all infinite sequences $c_0, c_1, \dots$ of complex numbers
satisfying the condition that $\sum_{i=0}^{\infty} |c_i|^2 < \infty$. We have already seen that this is an
inner product space. For each $k \ge 0$, let $v_k$ be the sequence $c_0, c_1, \dots$ in which

$$c_i = \begin{cases} 1 & \text{if } i = k, \\ 0 & \text{otherwise.} \end{cases}$$

Then $\{v_k \mid k \ge 0\}$ is a Hilbert subset of $V$ which is not a basis for $V$.


Example It is, of course, possible that a finitely-generated inner product space may
have many different orthonormal bases. For example, the canonical basis of $\mathbb{R}^4$ is
orthonormal, as is the basis


= OoOO 


We can use Proposition 16.7 to verify the assertion made in the previous chapter. 


Proposition 16.8 If $V$ and $W$ are inner product spaces over the same field
$F$, and if $V$ is finitely generated, then $\|\alpha\|$ is finite for all $\alpha \in \mathrm{Hom}(V, W)$.


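Proof Pick an orthonormal basis $\{v_1, \dots, v_n\}$ for $V$ and let $\alpha \in \mathrm{Hom}(V, W)$. If
$v = \sum_{i=1}^{n} a_iv_i$ is normal, then
$1 = \|v\|^2 = \langle v, v\rangle = \sum_{i=1}^{n}\sum_{j=1}^{n} a_i\overline{a_j}\langle v_i, v_j\rangle = \sum_{i=1}^{n} |a_i|^2$
and so $|a_i| \le 1$ for each $1 \le i \le n$. Therefore,
$\|\alpha(v)\| \le \sum_{i} |a_i| \cdot \|\alpha(v_i)\| \le \sum_{i} \|\alpha(v_i)\|$. Thus $\|\alpha\|$ is finite and
no greater than $c = \sum_{i=1}^{n} \|\alpha(v_i)\|$.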


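Example Of course, $\|\alpha\|$ may be finite even when $V$ is not finitely generated. For
example, let $V = C(0, 1)$ and define a norm on $V$ by setting
$\|f\| = \max\{|f(t)| \mid 0 \le t \le 1\}$. Let $g : [0, 1] \times [0, 1] \to \mathbb{R}$ be a continuous function. Let $\alpha$
be the endomorphism of $V$ defined by $\alpha(f) : t \mapsto \int_0^1 g(t, s)f(s)\,ds$. Since $g$ is continuous
on a closed subset of $\mathbb{R}^2$, we note that it is bounded there, say $|g(t, s)| \le c$ for all
$0 \le t, s \le 1$. Moreover, $|f(s)| \le \max\{|f(t)| \mid 0 \le t \le 1\} = \|f\|$ for all $0 \le s \le 1$,
and so

$$\begin{aligned}
\|\alpha(f)\| &= \max\left\{ \left| \int_0^1 g(t, s)f(s)\,ds \right| \;\middle|\; 0 \le t \le 1 \right\} \\
&\le \max\left\{ \int_0^1 |g(t, s)| \cdot |f(s)|\,ds \;\middle|\; 0 \le t \le 1 \right\} \le c\|f\|
\end{aligned}$$

for all $f \in V$.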
Proposition 16.9 Let $V$ be an inner product space having an orthonormal
basis $B$. If $v = \sum_{y \in B} a_yy$ and $w = \sum_{y \in B} b_yy$ are vectors in $V$ (where only
finitely-many of the $a_y$ and $b_y$ are nonzero), then $\langle v, w\rangle = \sum_{y \in B} a_y\overline{b_y}$.


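Proof By the properties of the inner product, we have

$$\langle v, w\rangle = \left\langle \sum_{y \in B} a_yy, \sum_{x \in B} b_xx \right\rangle = \sum_{y \in B}\sum_{x \in B} a_y\overline{b_x}\langle y, x\rangle = \sum_{y \in B} a_y\overline{b_y}.$$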


Proposition 16.10 Let $V$ be an inner product space having an orthonormal
basis $B$. Then each $v \in V$ satisfies $v = \sum_{x \in B} \langle v, x\rangle x$.


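Proof We know that there exist scalars $\{a_x \mid x \in B\}$, only finitely-many of which
are nonzero, such that $v = \sum_{x \in B} a_xx$. Then for each $x \in B$ we have
$\langle v, x\rangle = \left\langle \sum_{y \in B} a_yy, x \right\rangle = \sum_{y \in B} a_y\langle y, x\rangle = a_x\langle x, x\rangle = a_x$, which yields the desired result.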


The coefficients $\langle v, x\rangle$ encountered in Proposition 16.10 are called the Fourier
coefficients of the vector $v$ with respect to the given orthonormal basis.


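Example Consider the vector space $C(-1, 1)$ over $\mathbb{R}$, on which we have the inner
product $\langle f, g\rangle = \int_{-1}^{1} f(x)g(x)\,dx$. We want to find a polynomial function of
degree at most 3 which most closely approximates the function $f : x \mapsto \sin(x)$ on the
interval $[-1, 1]$. To do so, consider the subspace $V$ of $C(-1, 1)$ generated by the
functions $p_i : x \mapsto x^i$ for $0 \le i \le 3$ and $f$. Apply the Gram-Schmidt process to
the basis $\{p_0, \dots, p_3, f\}$ of $V$ to get an orthonormal basis $\{q_0, \dots, q_3, g\}$, where
$q_0 : x \mapsto \frac{\sqrt{2}}{2}$, $q_1 : x \mapsto \frac{\sqrt{6}}{2}x$, $q_2 : x \mapsto \frac{3\sqrt{10}}{4}\left(x^2 - \frac{1}{3}\right)$, and
$q_3 : x \mapsto \frac{5\sqrt{14}}{4}\left(x^3 - \frac{3}{5}x\right)$.
By Proposition 16.6 and Proposition 16.10, we know that the polynomial function of
degree at most 3 which most closely approximates the function $f$ is $\sum_{i=0}^{3} \langle f, q_i\rangle q_i$,
where the Fourier coefficients $\langle f, q_i\rangle$ are given by

$$\begin{aligned}
\langle f, q_0\rangle &= \int_{-1}^{1} \frac{\sqrt{2}}{2}\sin(x)\,dx = 0; \\
\langle f, q_1\rangle &= \int_{-1}^{1} \frac{\sqrt{6}}{2}\sin(x)\,x\,dx = \sqrt{6}\,(\sin(1) - \cos(1)) \approx 0.738; \\
\langle f, q_2\rangle &= \int_{-1}^{1} \frac{3\sqrt{10}}{4}\left(x^2 - \frac{1}{3}\right)\sin(x)\,dx = 0; \\
\langle f, q_3\rangle &= \int_{-1}^{1} \frac{5\sqrt{14}}{4}\left(x^3 - \frac{3}{5}x\right)\sin(x)\,dx \\
&= 14\sqrt{14}\cos(1) - 9\sqrt{14}\sin(1) \approx -0.034.
\end{aligned}$$

Thus the polynomial function we seek is given by $u : x \mapsto -0.158x^3 + 0.998x$.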
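The Fourier coefficients in this example can be checked numerically. The Python sketch below assumes the orthonormal polynomials $q_0, \dots, q_3$ on $[-1, 1]$ are the normalized Legendre polynomials $\frac{\sqrt{2}}{2}$, $\frac{\sqrt{6}}{2}x$, $\frac{3\sqrt{10}}{4}(x^2 - \frac{1}{3})$, and $\frac{5\sqrt{14}}{4}(x^3 - \frac{3}{5}x)$, and it approximates the integrals with a composite Simpson rule; the helper names `integrate` and `evaluate` and the number of subintervals are choices made for this illustration.

```python
import math

def integrate(fn, a=-1.0, b=1.0, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    s = fn(a) + fn(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * fn(a + k * h)
    return s * h / 3

# Normalized Legendre polynomials as coefficient lists (low degree first)
q = [
    [math.sqrt(2) / 2],
    [0.0, math.sqrt(6) / 2],
    [-math.sqrt(10) / 4, 0.0, 3 * math.sqrt(10) / 4],
    [0.0, -3 * math.sqrt(14) / 4, 0.0, 5 * math.sqrt(14) / 4],
]

def evaluate(coeffs, x):
    return sum(c * x ** k for k, c in enumerate(coeffs))

# Fourier coefficients <sin, q_i>
fourier = [integrate(lambda x, qi=qi: math.sin(x) * evaluate(qi, x)) for qi in q]

# Coefficients of the best approximation sum_i <sin, q_i> q_i
approx = [0.0] * 4
for c, qi in zip(fourier, q):
    for k, a in enumerate(qi):
        approx[k] += c * a

print([round(c, 3) for c in fourier])
print([round(c, 3) for c in approx])
```

Under these assumptions the computation gives a linear coefficient of about $0.998$ and a cubic coefficient of about $-0.158$, close to the Taylor coefficients $1$ and $-\frac{1}{6}$ of the sine function.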


Proposition 16.11 Let $F$ be $\mathbb{R}$ or $\mathbb{C}$ and let $k$ and $n$ be positive integers.
Let $A \in M_{k\times n}(F)$ be a matrix the columns of which are linearly independent
in $F^k$. Then there exist matrices $Q \in M_{k\times n}(F)$ and $R \in M_{n\times n}(F)$ such that
(1) $A = QR$;
(2) The columns of $Q$ are orthonormal with respect to the dot product on $F^k$;
(3) $R$ is nonsingular and upper-triangular.


Proof Let $u_1, \dots, u_n$ be the columns of $A$. Apply the Gram-Schmidt process to
the set $\{u_1, \dots, u_n\}$ and then normalize each of the resulting vectors to obtain
an orthonormal set $\{v_1, \dots, v_n\}$ of vectors in $F^k$. Let $Q \in M_{k\times n}(F)$ be the matrix
having columns $v_1, \dots, v_n$. Then, by Proposition 16.10, we see that
$u_i = \sum_{j=1}^{n} (u_i \cdot v_j)v_j$ for all $1 \le i \le n$, and so $A = QR$, where $R = [r_{ij}] \in M_{n\times n}(F)$
is given by $r_{ij} = u_j \cdot v_i$ for all $1 \le i, j \le n$. This matrix is clearly nonsingular.
Moreover, we note that the Gram-Schmidt process is such that $v_i$ is orthogonal
to $u_1, \dots, u_{i-1}$ for all $2 \le i \le n$ and so $r_{ij} = 0$ when $i > j$. Therefore, $R$ is also
upper-triangular.


A factorization of a matrix in the form given by Proposition 16.11 is called a
QR-decomposition. Such decompositions form the basis of many important numerical
algorithms, and are widely used, for example, in computing eigenvalues of large
matrices. The use is primarily iterative. If $A$ is an $n \times n$ matrix over $\mathbb{R}$ or $\mathbb{C}$ the
eigenvalues of which have distinct absolute values and if we can indefinitely perform
the iteration
(1) $A_1 = A$;
(2) If $A_i$ has a QR-decomposition $A_i = Q_iR_i$ then set $A_{i+1} = R_iQ_i$;
then, under rather mild conditions on $A$, the sequence $A_1, A_2, \dots$ of matrices tends to an
upper triangular matrix in which the eigenvalues of $A$ appear in decreasing order
of absolute value along the diagonal.
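The iteration above can be sketched in a few lines of Python. This is an unshifted illustration for small real square matrices only; the function names `qr`, `matmul`, and `qr_iteration`, the fixed number of steps, and the sample matrix are assumptions made here, and the QR-decomposition is computed by a Gram-Schmidt sweep rather than by the more robust methods used in practice.

```python
import math

def qr(A):
    """QR-decomposition of a real nonsingular square matrix (list of rows)."""
    n = len(A)
    cols = [[A[r][c] for r in range(n)] for c in range(n)]
    Q, R = [], [[0.0] * n for _ in range(n)]
    for i, u in enumerate(cols):
        v = list(u)
        for j, q in enumerate(Q):
            R[j][i] = sum(a * b for a, b in zip(v, q))
            v = [a - R[j][i] * b for a, b in zip(v, q)]
        R[i][i] = math.sqrt(sum(a * a for a in v))
        Q.append([a / R[i][i] for a in v])
    Q_rows = [[Q[c][r] for c in range(n)] for r in range(n)]   # Q was built column by column
    return Q_rows, R

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def qr_iteration(A, steps=200):
    """Repeat A_{i+1} = R_i Q_i, where A_i = Q_i R_i."""
    Ak = [[float(x) for x in row] for row in A]
    for _ in range(steps):
        Q, R = qr(Ak)
        Ak = matmul(R, Q)
    return Ak

A = [[2.0, 1.0], [1.0, 3.0]]      # eigenvalues (5 + sqrt(5))/2 and (5 - sqrt(5))/2
T = qr_iteration(A)
print(T)
```

For this sample matrix the diagonal of the limit carries the two eigenvalues in decreasing order, and the entry below the diagonal tends to zero.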


QR-decompositions were developed independently by the Swiss computer scientist
Heinz Rutishauser, one of the fathers of ALGOL, by the Russian computer scientist
Vera Kublanovskaya, and by John G.F. Francis of the British computer manufacturer
Ferranti Ltd.


One of the major advantages of QR-decompositions is that they are easy to up- 
date. If we are given a decomposition A = QR, and then the matrix A is altered 
slightly to obtain a matrix $A'$ by changing a few of its entries, it is relatively easy
to alter $Q$ and $R$ to get a QR-decomposition for $A'$. This is important since many
applications of linear algebra involve solving successive systems of linear equations
of the form $A^{(i)}X = w^{(i)}$, where $A^{(i+1)}$ and $w^{(i+1)}$ are obtained from $A^{(i)}$ and $w^{(i)}$
by relatively minor modifications, based on data from some external source which
is periodically updated.

The following QR-algorithm is used to compute a QR-decomposition of a matrix
$A \in M_{k\times n}(F)$ with columns $u_1, \dots, u_n$:

For $i = 1$ to $n$ do steps (1)-(3):
(1) Set $v_i = u_i$;
(2) For $j = 1$ to $i - 1$ set $r_{ji} = u_i \cdot v_j$ and $v_i = v_i - r_{ji}v_j$;
(3) Set $r_{ii} = \|v_i\|$ and $v_i = r_{ii}^{-1}v_i$.

Then $Q$ is the matrix with columns $v_1, \dots, v_n$ and $R = [r_{ij}]$. Note that step
(3) presupposes that we have already checked that the set of columns of $A$ is
linearly independent. If not, then we have to add an initial check to ensure that $r_{ii}$
is nonzero, before we attempt to invert it. As already noted, the Gram-Schmidt
method is not numerically stable and hence neither is this algorithm for finding a
QR-decomposition. It can be modified to produce a somewhat more stable
algorithm by replacing the definition of $r_{ji}$ in step (2) by $r_{ji} = v_i \cdot v_j$.
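Steps (1)-(3), together with the modification mentioned at the end of the paragraph, can be written as one short Python function. This is a sketch for real matrices only (over $\mathbb{C}$ the dot product would need a complex conjugate); the function name `gram_schmidt_qr`, the `modified` flag, and the dictionary used to store the entries of $R$ are assumptions made for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt_qr(columns, modified=False):
    """columns: the columns u_1, ..., u_n of A (real, linearly independent).

    Returns (V, r): V lists the orthonormal columns of Q, and r[j, i] is the
    (j, i) entry of the upper-triangular matrix R.
    """
    V, r = [], {}
    for i, u in enumerate(columns):
        v = list(u)                                       # step (1)
        for j, vj in enumerate(V):                        # step (2)
            # classical: project the original column; modified: the current residual
            r[j, i] = dot(v, vj) if modified else dot(u, vj)
            v = [a - r[j, i] * b for a, b in zip(v, vj)]
        r[i, i] = math.sqrt(dot(v, v))                    # step (3)
        if r[i, i] == 0.0:
            raise ValueError("columns are linearly dependent")
        V.append([a / r[i, i] for a in v])
    return V, r

cols = [[3.0, 4.0], [1.0, 2.0]]        # the columns of A = [[3, 1], [4, 2]]
V, r = gram_schmidt_qr(cols)
V_mod, r_mod = gram_schmidt_qr(cols, modified=True)
print(V, r)
```

In exact arithmetic both variants return the same $Q$ and $R$; they differ only in how rounding errors propagate.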

A variant on this algorithm, called the QZ algorithm, has been devised by Moler 
and Stewart to find solutions for generalized eigenvalue problems. 



The contemporary American computer scientist 
Cleve Moler, after a distinguished academic ca- 
reer, became chairman and chief scientist of 
MathWorks, the company that developed MAT- 
LAB. G.W. Stewart is a contemporary American 
computer scientist. 


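Proposition 16.12 Let $V$ be an inner product space having an orthonormal
basis $B$. Then for all $v, w \in V$:

(1) (Parseval's identity) $\langle v, w\rangle = \sum_{y \in B} \overline{\langle y, v\rangle}\,\langle y, w\rangle$;
(2) (Bessel's identity) $\|v\|^2 = \sum_{y \in B} |\langle y, v\rangle|^2$.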


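Proof Parseval's identity follows from the calculation

$$\langle v, w\rangle = \left\langle \sum_{y \in B} \langle v, y\rangle y, w \right\rangle = \sum_{y \in B} \langle v, y\rangle\langle y, w\rangle = \sum_{y \in B} \overline{\langle y, v\rangle}\,\langle y, w\rangle;$$

Bessel's identity derives from this in the special case $v = w$.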


Wilhelm Bessel was a nineteenth century astronomer and a friend of 
Gauss; his mathematical work came as a result of his research on plan- 
etary orbits. The French mathematician Marc-Antoine Parseval pub- 
lished only five short papers at the end of the eighteenth century. 


The following result shows that orthogonality can be used to determine the
relation between two different inner products defined on a vector space over $\mathbb{R}$.


Proposition 16.13 Let $V$ be a vector space over $\mathbb{R}$ on which we have defined
two inner products, $\mu_1$ and $\mu_2$. For $i = 1, 2$, let
$Y_i = \{(v, w) \in V \times V \mid \mu_i(v, w) = 0\}$. Then the following conditions are equivalent:

(1) There exists a positive real number $c$ such that $\mu_2 = c^2\mu_1$;

(2) $Y_1 = Y_2$;

(3) $Y_1 \subseteq Y_2$.


Proof Since it is clear that (1) implies (2), and (2) implies (3), all we have to
prove is that (3) implies (1). Therefore, assume (3). First, let us consider the
case $\dim(V) = 1$, i.e., the case in which $V = \mathbb{R}$. Then, for $i = 1, 2$, the scalar
$b_i = \mu_i(1, 1)$ is nonzero. Set $d = \mu_2(1, 1)/\mu_1(1, 1)$. If $a, b \in \mathbb{R}$, then
$\mu_2(a, b) = ab\,\mu_2(1, 1) = abd\,\mu_1(1, 1) = d\mu_1(a, b)$ and so, taking $c = \sqrt{d}$, we have
established (1). Thus we can assume that $\dim(V) \ge 2$. For each $i = 1, 2$, and each $v \in V$,
let $\|v\|_i = \sqrt{\mu_i(v, v)} \ge 0$. Without loss of generality, we can assume that there
exist elements $v, w \in V$ and positive real numbers $a < b$ such that $\|v\|_2 = a\|v\|_1$
and $\|w\|_2 = b\|w\|_1$, since otherwise we would immediately have (1). Suppose that
$w = dv$ for some $0 \ne d \in \mathbb{R}$. Then $\|w\|_2 = |d| \cdot \|v\|_2 = |d|a \cdot \|v\|_1 = a\|w\|_1$, and so
$a = b$, which is contrary to our assumption that $a < b$. Therefore, we conclude that
the set $\{v, w\}$ is linearly independent. Normalizing $v$ and $w$ with respect to $\mu_1$, if
necessary, we can furthermore assume that $\|v\|_1 = 1 = \|w\|_1$.

We claim that $(v, w) \notin Y_1$. Indeed, assume otherwise. Then we have
$\mu_1(v + w, v - w) = \mu_1(v, v) - \mu_1(w, w) = \|v\|_1^2 - \|w\|_1^2 = 0$, which implies
$(v + w, v - w) \in Y_1$. Therefore, by (3), $(v + w, v - w) \in Y_2$. But then we have
$\mu_2(v + w, v - w) = \|v\|_2^2 - \|w\|_2^2 = a^2 - b^2 \in \mathbb{R} \setminus \{0\}$, yielding a contradiction
and establishing the claim. Set $y = v - \|v\|_1^2\mu_1(v, w)^{-1}w$. Then
$\mu_1(y, v) = \|v\|_1^2 - r\mu_1(v, w) = 0$, where $r = \|v\|_1^2\mu_1(v, w)^{-1}$, and so $(y, v) \in Y_1 \subseteq Y_2$.
Since the set $\{v, w\}$ is linearly independent, we know that $y \ne 0_V$. If we set
$y' = \|y\|_1^{-1}y$, then $(y', v) \in Y_1 \subseteq Y_2$ and so, as before,
$\mu_1(y' + v, y' - v) = \|y'\|_1^2 - \|v\|_1^2 = 0$. Hence $(y' + v, y' - v) \in Y_1$. Since $\mu_1(y, v) = 0$, we have
$\|y\|_1^2 + \|v\|_1^2 = \|{-y} + v\|_1^2 = \|rw\|_1^2 = \|v\|_1^4\mu_1(v, w)^{-2}\|w\|_1^2$ and so
$\|y\|_1^2 = \|v\|_1^4\mu_1(v, w)^{-2}\|w\|_1^2 - \|v\|_1^2$. Since $\mu_2(y, v) = 0$, we see that
$\|y\|_2^2 + \|v\|_2^2 = \|{-y} + v\|_2^2 = \|rw\|_2^2 = \|v\|_1^4\mu_1(v, w)^{-2}\|w\|_2^2$. Thus

$$\begin{aligned}
\|y\|_2^2 &= \|v\|_1^4\mu_1(v, w)^{-2}\|w\|_2^2 - \|v\|_2^2 \\
&= \|v\|_1^4\mu_1(v, w)^{-2}\,b^2\|w\|_1^2 - a^2\|v\|_1^2 \\
&> a^2\left(\|v\|_1^4\mu_1(v, w)^{-2}\|w\|_1^2 - \|v\|_1^2\right) = a^2\|y\|_1^2.
\end{aligned}$$

Since $\mu_2(y', v) = 0$, this implies that $\mu_2(y' + v, y' - v) = \|y'\|_2^2 - \|v\|_2^2 > a^2 - a^2 = 0$,
contradicting (3) and the fact that $\mu_1(y' + v, y' - v) = 0$. From this contradiction,
we conclude that there can be no elements $a$ and $b$ as above, so there
must exist a positive real number $c$ such that $\|v\|_2 = c\|v\|_1$ for each nonzero vector
$v \in V$. Then for each $v, w \in V$ we have

$$\begin{aligned}
\mu_2(v, w) &= \frac{1}{4}\left[\|v + w\|_2^2 - \|v - w\|_2^2\right] \\
&= \frac{c^2}{4}\left[\|v + w\|_1^2 - \|v - w\|_1^2\right] = c^2\mu_1(v, w),
\end{aligned}$$

which proves (1).


Let $V$ be an inner product space. We have already seen that, for each $w \in V$,
the function from $V$ to the field of scalars given by $v \mapsto \langle v, w\rangle$ belongs to $D(V)$. If
$V$ is finitely generated, we claim that every element of $D(V)$ is of this form. The
following result is actually a special case of a much wider, and more complicated,
theorem.


Proposition 16.14 (Riesz Representation Theorem) Let $V$ be a
finitely-generated inner product space. If $\delta \in D(V)$ then there exists a unique vector
$y \in V$ satisfying $\delta(v) = \langle v, y\rangle$ for all $v \in V$.


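Proof Let $\{v_1, \dots, v_n\}$ be an orthonormal basis for $V$ and let $y = \sum_{i=1}^{n} \overline{\delta(v_i)}\,v_i$.
Then for all $1 \le h \le n$ we have

$$\langle v_h, y\rangle = \left\langle v_h, \sum_{i=1}^{n} \overline{\delta(v_i)}\,v_i \right\rangle = \sum_{i=1}^{n} \delta(v_i)\langle v_h, v_i\rangle = \delta(v_h),$$

and so $\langle v, y\rangle = \delta(v)$ for all $v \in V$. The vector $y$ is unique since if $\langle v, x\rangle = \langle v, y\rangle$ for
all $v \in V$ then $x = \sum_{i=1}^{n} \langle x, v_i\rangle v_i = \sum_{i=1}^{n} \overline{\delta(v_i)}\,v_i = y$, as desired.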
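The construction in the proof of Proposition 16.14 is easy to try out numerically. The Python sketch below works in $\mathbb{R}^3$ with the dot product, so no complex conjugation is needed; the function name `riesz_vector`, the particular orthonormal basis, and the sample functional `delta` are assumptions made for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def riesz_vector(delta, orthonormal_basis):
    """Return y = sum_i delta(v_i) v_i for a real inner product space."""
    y = [0.0] * len(orthonormal_basis[0])
    for v in orthonormal_basis:
        c = delta(v)
        y = [a + c * b for a, b in zip(y, v)]
    return y

s = 1 / math.sqrt(2)
basis = [[s, s, 0.0], [s, -s, 0.0], [0.0, 0.0, 1.0]]   # an orthonormal basis of R^3

def delta(v):
    """A sample linear functional on R^3."""
    return 2 * v[0] - v[1] + 3 * v[2]

y = riesz_vector(delta, basis)
x = [1.0, 2.0, -1.0]
print(y, delta(x), dot(x, y))
```

The vector obtained is (up to rounding) $y = (2, -1, 3)$, independently of which orthonormal basis is used, in line with the uniqueness statement.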
The twentieth century Hungarian mathematician Frigyes Riesz was
one of the founders of functional analysis. 


Example Let $n > 1$ be an integer and let $V$ be the subspace of $\mathbb{R}^{\mathbb{R}}$ consisting of
all polynomial functions of degree at most $n$, on which we have an inner product
defined by $\langle f, g\rangle = \int_{-1}^{1} f(t)g(t)\,dt$. Let $\delta \in D(V)$ be the linear functional defined
by $\delta : f \mapsto f(0)$. By Proposition 16.14, there exists a polynomial function $p \in V$
satisfying the condition $f(0) = \int_{-1}^{1} f(t)p(t)\,dt$ for all $f \in V$. The function $p$ is
given by $\sum_{i=0}^{n} p_i(0)p_i$, where $p_i$ is the $i$th Legendre polynomial, normalized so
that $\|p_i\| = 1$.


Proposition 16.15 Let $V$ and $W$ be finitely-generated inner product spaces,
and let $\alpha : V \to W$ be a linear transformation. Then there exists a unique
linear transformation $\alpha^* : W \to V$ satisfying the condition
$\langle \alpha(v), w\rangle = \langle v, \alpha^*(w)\rangle$ for all $v \in V$ and all $w \in W$.


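Proof Let $w$ be a given vector in $W$. It is easy to check that the function $\delta$ from
$V$ to $F$ defined by $\delta : v \mapsto \langle \alpha(v), w\rangle$ is a linear functional. By Proposition 16.14,
we know that there exists a unique vector $y_w \in V$ satisfying $\delta(v) = \langle v, y_w\rangle$ for all
$v \in V$. Define the function $\alpha^* : W \to V$ by $\alpha^* : w \mapsto y_w$. We have to prove that this
function is indeed a linear transformation. Indeed, if $w_1, w_2 \in W$ then

$$\begin{aligned}
\langle v, \alpha^*(w_1 + w_2)\rangle &= \langle \alpha(v), w_1 + w_2\rangle = \langle \alpha(v), w_1\rangle + \langle \alpha(v), w_2\rangle \\
&= \langle v, \alpha^*(w_1)\rangle + \langle v, \alpha^*(w_2)\rangle = \langle v, \alpha^*(w_1) + \alpha^*(w_2)\rangle,
\end{aligned}$$

and this is true for all $v \in V$, so we have $\alpha^*(w_1 + w_2) = \alpha^*(w_1) + \alpha^*(w_2)$ for all
$w_1, w_2 \in W$. If $c$ is a scalar and if $w \in W$ then

$$\langle v, \alpha^*(cw)\rangle = \langle \alpha(v), cw\rangle = \overline{c}\langle \alpha(v), w\rangle = \overline{c}\langle v, \alpha^*(w)\rangle = \langle v, c\alpha^*(w)\rangle$$

for all $v \in V$, and hence $\alpha^*(cw) = c\alpha^*(w)$. Thus $\alpha^*$ is a linear transformation and,
since $y_w$ is uniquely defined, it is also unique.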


Let $V$ and $W$ be inner product spaces and let $\alpha : V \to W$ be a linear
transformation. A linear transformation $\alpha^* : W \to V$ satisfying the condition
$\langle \alpha(v), w\rangle = \langle v, \alpha^*(w)\rangle$ for all $v \in V$ and $w \in W$ is called an adjoint transformation of $\alpha$. If such
an adjoint exists, it must be unique. Indeed, assume that $\alpha : V \to W$ has adjoints $\alpha^*$
and $\alpha^{\dagger}$ and that there exists an element $w' \in W$ satisfying $\alpha^*(w') \ne \alpha^{\dagger}(w')$. Set
$v' = \alpha^*(w') - \alpha^{\dagger}(w')$. Then
$\langle v', v'\rangle = \langle v', \alpha^*(w')\rangle - \langle v', \alpha^{\dagger}(w')\rangle = \langle \alpha(v'), w'\rangle - \langle \alpha(v'), w'\rangle = 0$
and so $v' = 0_V$, which is a contradiction.

By Proposition 16.15, we know that if $V$ and $W$ are finitely generated then every
$\alpha \in \mathrm{Hom}(V, W)$ has an adjoint.


Proposition 16.16 Let $V$ and $W$ be finitely-generated inner product spaces,
having orthonormal bases $B = \{v_1, \dots, v_n\}$ and $D = \{w_1, \dots, w_k\}$, respectively.
Let $\alpha : V \to W$ be a linear transformation. Then $\Phi_{BD}(\alpha)$ is the matrix
$A = [a_{ij}]$, where $a_{ji} = \langle \alpha(v_i), w_j\rangle$, and $\Phi_{DB}(\alpha^*) = A^H$.


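Proof For all $1 \le i \le n$, let $\alpha(v_i) = \sum_{h=1}^{k} a_{hi}w_h$. Then for all $1 \le j \le k$ we have
$\langle \alpha(v_i), w_j\rangle = \left\langle \sum_{h=1}^{k} a_{hi}w_h, w_j \right\rangle = a_{ji}$ and also
$\langle \alpha^*(w_j), v_i\rangle = \overline{\langle v_i, \alpha^*(w_j)\rangle} = \overline{\langle \alpha(v_i), w_j\rangle} = \overline{a_{ji}}$, as needed.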
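Proposition 16.16 can be checked on a small example. The Python sketch below takes $V = \mathbb{C}^3$ and $W = \mathbb{C}^2$ with their canonical (orthonormal) bases and the inner product $\langle u, v\rangle = \sum_i u_i\overline{v_i}$, and verifies that the conjugate transpose of a matrix represents the adjoint. The function names and the sample matrix and vectors are assumptions chosen for illustration.

```python
def inner(u, v):
    """Inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def conjugate_transpose(A):
    return [[A[r][c].conjugate() for r in range(len(A))] for c in range(len(A[0]))]

A = [[1 + 2j, 0j, 3j],
     [2 + 0j, 1 - 1j, -1 + 0j]]       # represents a linear map from C^3 to C^2
AH = conjugate_transpose(A)           # represents the adjoint, from C^2 to C^3

v = [1 + 1j, 2 - 1j, 0.5j]
w = [3 - 2j, -1 + 4j]
lhs = inner(matvec(A, v), w)          # <alpha(v), w>
rhs = inner(v, matvec(AH, w))         # <v, alpha*(w)>
print(lhs, rhs)
```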


Example It is, of course, possible that a linear transformation between inner product
spaces can have an adjoint even if the spaces are not finitely generated. For
example, let $[a, b]$ be a closed interval on the real line and let $V$ be the vector space of all
differentiable functions from $[a, b]$ to $\mathbb{R}$. Define an inner product on $V$ by setting
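$\langle f, g\rangle = \int_a^b f(x)g(x)\,dx$. This is an inner product space which is not finitely
generated over $\mathbb{R}$. Let $\alpha$ be the endomorphism of $V$ satisfying $\alpha(f) : x \mapsto \int_a^b e^{-xt}f(t)\,dt$.
Then $\langle \alpha(f), g\rangle = \langle f, \alpha(g)\rangle$ for all $f, g \in V$, and so $\alpha^*$ exists, and equals $\alpha$.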


Proposition 16.17 Let $V$, $W$, and $Y$ be inner product spaces. Let $\alpha$ and $\beta$
be linear transformations from $V$ to $W$ having adjoints, let $\zeta$ be a linear
transformation from $W$ to $Y$ having an adjoint, and let $c$ be a scalar. Then:
(1) $(\alpha + \beta)^* = \alpha^* + \beta^*$;
(2) $(c\alpha)^* = \overline{c}\alpha^*$;
(3) $(\zeta\alpha)^* = \alpha^*\zeta^*$;
(4) $\alpha^{**} = \alpha$.


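Proof (1) For all $v \in V$ and all $w \in W$, we have

$$\begin{aligned}
\langle v, (\alpha + \beta)^*(w)\rangle &= \langle (\alpha + \beta)(v), w\rangle = \langle \alpha(v) + \beta(v), w\rangle \\
&= \langle \alpha(v), w\rangle + \langle \beta(v), w\rangle = \langle v, \alpha^*(w)\rangle + \langle v, \beta^*(w)\rangle \\
&= \langle v, (\alpha^* + \beta^*)(w)\rangle,
\end{aligned}$$

and so by the uniqueness of the adjoint we get $(\alpha + \beta)^* = \alpha^* + \beta^*$.
(2) For all $v \in V$ and all $w \in W$, we have

$$\begin{aligned}
\langle v, (c\alpha)^*(w)\rangle &= \langle (c\alpha)(v), w\rangle = \langle c(\alpha(v)), w\rangle = c\langle \alpha(v), w\rangle \\
&= c\langle v, \alpha^*(w)\rangle = \langle v, \overline{c}\alpha^*(w)\rangle = \langle v, (\overline{c}\alpha^*)(w)\rangle,
\end{aligned}$$

and so $(c\alpha)^* = \overline{c}\alpha^*$.
(3) For all $v \in V$ and all $y \in Y$, we have
$\langle v, (\zeta\alpha)^*(y)\rangle = \langle (\zeta\alpha)(v), y\rangle = \langle \alpha(v), \zeta^*(y)\rangle = \langle v, \alpha^*\zeta^*(y)\rangle$, and so $(\zeta\alpha)^* = \alpha^*\zeta^*$.
(4) For all $v \in V$ and all $w \in W$, we have
$\langle w, \alpha^{**}(v)\rangle = \langle \alpha^*(w), v\rangle = \overline{\langle v, \alpha^*(w)\rangle} = \overline{\langle \alpha(v), w\rangle} = \langle w, \alpha(v)\rangle$, and so $\alpha^{**} = \alpha$.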


If $(K, \bullet)$ is an algebra over a field $F$, then a function $a \mapsto a^*$ from $K$ to itself is
an involution of $K$ if and only if the following additional conditions are satisfied:
(1) $(a + b)^* = a^* + b^*$ and $(a \bullet b)^* = b^* \bullet a^*$ for all $a, b \in K$;
(2) $a^{**} = a$ for all $a \in K$.

Note that $0^* = (0 + 0)^* = 0^* + 0^*$ and so $0^* = 0$. This means that if $0 \ne a \in K$ then
$0 \ne a^*$, by (2). If $K$ is unital, then $1 = 1^{**} = (1 \bullet 1^*)^* = 1^{**} \bullet 1^* = 1 \bullet 1^* = 1^*$.
In this case, if $a$ is a unit of $K$ then $(a^{-1})^*a^* = (aa^{-1})^* = 1^* = 1$ and similarly
$a^*(a^{-1})^* = 1$, so $(a^{-1})^* = (a^*)^{-1}$.

An element $a$ of $K$ is symmetric with respect to $*$ if and only if $a^* = a$. If $b \in K$
is a unit symmetric with respect to $*$ then it is straightforward to verify that the
function $a \mapsto b^{-1}a^*b$ is also an involution of $K$.


Example If $V$ is a finitely-generated inner product space, then we see that the
function $\alpha \mapsto \alpha^*$ is an involution of the $F$-algebra $\mathrm{End}(V)$. Another involution we have
already seen is the function $A \mapsto A^T$ of the $F$-algebra $M_{n\times n}(F)$, for any field $F$.
Of course, in the case $F = \mathbb{R}$, the relation between these two involutions can be seen
from Proposition 16.16. We have also seen that the function $A \mapsto A^H$ is an
involution on $M_{n\times n}(\mathbb{C})$, and its relation to the involution $\alpha \mapsto \alpha^*$ is also immediate from
Proposition 16.15.


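Example Let $F$ be a field and let $(K, \bullet)$ be an $F$-algebra. Define an operation $\odot$
on $K^2$ by setting
$\begin{bmatrix} a \\ b \end{bmatrix} \odot \begin{bmatrix} c \\ d \end{bmatrix} = \begin{bmatrix} a \bullet c \\ d \bullet b \end{bmatrix}$.
Then $(K^2, \odot)$ is an $F$-algebra and the function
$\begin{bmatrix} a \\ b \end{bmatrix} \mapsto \begin{bmatrix} b \\ a \end{bmatrix}$ is an involution of this algebra.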


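Proposition 16.18 Let $\alpha : V \to W$ be a linear transformation between
finitely-generated inner product spaces. Then:

(1) $\ker(\alpha^*) = \mathrm{im}(\alpha)^\perp$;
(2) $\ker(\alpha) = \mathrm{im}(\alpha^*)^\perp$;
(3) $\mathrm{im}(\alpha) = \ker(\alpha^*)^\perp$;
(4) $\mathrm{im}(\alpha^*) = \ker(\alpha)^\perp$.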
Proof (1) We note that

$$\begin{aligned}
\ker(\alpha^*) &= \{w \in W \mid \alpha^*(w) = 0_V\} \\
&= \{w \in W \mid \langle v, \alpha^*(w)\rangle = 0 \text{ for all } v \in V\} \\
&= \{w \in W \mid \langle \alpha(v), w\rangle = 0 \text{ for all } v \in V\} = \mathrm{im}(\alpha)^\perp.
\end{aligned}$$

(2) This follows from the same argument as (1), replacing $\alpha$ by $\alpha^*$.
(3) By (1) and Proposition 16.5, we have $\mathrm{im}(\alpha) = (\mathrm{im}(\alpha)^\perp)^\perp = \ker(\alpha^*)^\perp$.
(4) This follows from (2) in the way (3) follows from (1).


Proposition 16.19 If $\alpha$ is an endomorphism of a finitely-generated inner
product space $V$ then $\mathrm{null}(\alpha) = \mathrm{null}(\alpha^*)$.


Proof By Proposition 6.10 and Proposition 16.18, we see that
$\mathrm{null}(\alpha) = \dim(\mathrm{im}(\alpha^*)^\perp) = \dim(V) - \dim(\mathrm{im}(\alpha^*)) = \mathrm{null}(\alpha^*)$.


Example Proposition 16.19 is not necessarily true for inner product spaces which
are not finitely generated. For example, let $V = \mathbb{R}^{(\infty)}$ with the inner product
$\langle [a_0, a_1, \dots], [b_0, b_1, \dots]\rangle = \sum_{i=0}^{\infty} a_ib_i$. Let $\alpha \in \mathrm{End}(V)$ be given by
$\alpha : [a_0, a_1, \dots] \mapsto [0, a_0, a_1, \dots]$. Then $\alpha^*$ exists and is given by
$\alpha^* : [a_0, a_1, \dots] \mapsto [a_1, a_2, \dots]$. Clearly, $\ker(\alpha)$ is trivial but $\ker(\alpha^*)$ is not.
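The two shift maps of this example can be modeled directly in Python. In the sketch below a sequence with only finitely many nonzero entries is represented by a finite list, with all later entries understood to be zero; this representation and the function names `right_shift`, `left_shift`, and `inner` are assumptions made for illustration.

```python
def right_shift(a):
    """alpha: [a0, a1, ...] -> [0, a0, a1, ...]"""
    return [0] + list(a)

def left_shift(a):
    """alpha*: [a0, a1, ...] -> [a1, a2, ...]"""
    return list(a[1:])

def inner(a, b):
    # zip stops at the shorter list, which matches padding the other with zeros
    return sum(x * y for x, y in zip(a, b))

a = [1, 2, 3]
b = [4, 5, 6, 7]
print(inner(right_shift(a), b), inner(a, left_shift(b)))   # the two values agree

e0 = [1]                      # the sequence 1, 0, 0, ...
print(left_shift(e0))         # the zero sequence, so e0 is a nonzero element of ker(alpha*)
```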


Proposition 16.20 Let $\alpha : V \to W$ be a linear transformation between
finitely-generated inner product spaces. Then:

(1) If $\alpha$ is a monomorphism then $\alpha^*\alpha$ is an automorphism of $V$;

(2) If $\alpha$ is an epimorphism then $\alpha\alpha^*$ is an automorphism of $W$.


Proof (1) It suffices to prove that the linear transformation $\alpha^*\alpha$ is monic. And,
indeed, if $v \in V$ satisfies $\alpha^*\alpha(v) = 0_V$ then $\langle \alpha(v), \alpha(v)\rangle = \langle v, \alpha^*\alpha(v)\rangle = \langle v, 0_V\rangle = 0$
and so $\alpha(v) = 0_W$. Since $\alpha$ is a monomorphism, $v = 0_V$ and so we have shown
that $\alpha^*\alpha$ is monic, as we needed.

(2) First of all, we will show that $\alpha^*$ is a monomorphism. Indeed, if $w_1, w_2 \in W$
are vectors satisfying $\alpha^*(w_1) = \alpha^*(w_2)$ then for all $v \in V$ we have
$\langle \alpha(v), w_1 - w_2\rangle = \langle v, \alpha^*(w_1) - \alpha^*(w_2)\rangle = 0$ and since $\alpha$ is an epimorphism, we
conclude that $\langle w, w_1 - w_2\rangle = 0$ for all $w \in W$. This implies that $w_1 - w_2 = 0_W$
and so $w_1 = w_2$, showing that $\alpha^*$ is indeed monic. Now we will show that $\alpha\alpha^*$
is also monic, which will suffice to prove (2). Indeed, if $\alpha\alpha^*(w) = 0_W$ then
$\langle \alpha^*(w), \alpha^*(w)\rangle = \langle \alpha\alpha^*(w), w\rangle = 0$ and so $\alpha^*(w) = 0_V$, proving that $w = 0_W$.
Proposition 16.21 Let $\alpha : V \to W$ be an isomorphism between
finitely-generated inner product spaces. Then $(\alpha^*)^{-1} = (\alpha^{-1})^*$.


Proof Let $\beta = (\alpha^{-1})^*$. Then for all $v_1, v_2 \in V$ we have
$\langle v_1, v_2\rangle = \langle \alpha^{-1}\alpha(v_1), v_2\rangle = \langle \alpha(v_1), \beta(v_2)\rangle = \langle v_1, \alpha^*\beta(v_2)\rangle$
and so $\alpha^*\beta(v_2) = v_2$ for all $v_2 \in V$, which means
that $\alpha^*\beta$ is the identity map on $V$. Thus $\beta = (\alpha^*)^{-1}$.


Finally, we mention a few consequences of some of the above results with which
we will not deal at length, but which have extensive and interesting discussions in the
mathematical literature.
(1) In an inner product space over $\mathbb{R}$, we can also project onto affine subsets and
not just onto subspaces. Indeed, if $V$ is an inner product space over $\mathbb{R}$ then any
nonzero element $v$ of $V$ defines a linear functional $\delta_v \in D(V)$ given by $\delta_v : w \mapsto \langle w, v\rangle$.
If $0 \ne c \in \mathbb{R}$, then $\delta_v^{-1}(c)$ is an affine subset of $V$. Define a function $\theta_v : V \to V$
by setting

$$\theta_v : y \mapsto y - \frac{\delta_v(y) - c}{\|v\|^2}\,v.$$

Then for all $y \in V$ we have $\delta_v\theta_v(y) = c$ and so we see that $\theta_v(y) \in \delta_v^{-1}(c)$, so
that $\mathrm{im}(\theta_v) \subseteq \delta_v^{-1}(c)$. Moreover, $\theta_v^2 = \theta_v$. We call the function $\theta_v$ the projection
on the affine set $\delta_v^{-1}(c)$. Such projections have many applications, such as the
algebraic reconstruction technique (ART), which is very important in
computerized imaging.
(2) From Proposition 16.5, we see that the rule which assigns to each subspace $W$ of
a finitely-generated inner product space $V$ the orthogonal projection of $V$ onto
$W$ is an embedding of the set of all subspaces of $V$ into the algebra $\mathrm{End}(V)$.
This observation has many ramifications, of which we mention but one. Let
$V$ be a finitely-generated inner product space. For subspaces $W$ and $Y$ of $V$,
we can define the gap between $W$ and $Y$ to be $g(W, Y) = \|\pi_W - \pi_Y\|$, where
$\pi_W$ and $\pi_Y$ are the orthogonal projections of $V$ onto $W$ and $Y$, respectively.
This allows us to measure the distance between subspaces of $V$ in a natural
way. One immediately sees that $g(W, Y) = g(W^\perp, Y^\perp)$ and that $g(W, Y) \le 1$
for all such $W$ and $Y$. Since the gap is a distance function, in the sense of
Proposition 15.12, it turns the set of all subspaces of $V$ into a metric space,
the topological properties of which can be studied. For example, one can show
that this space is compact and, as a result, also complete, meaning that every
sequence $W_1, W_2, \dots$ of subspaces of $V$ satisfying $\lim_{i,j \to \infty} g(W_i, W_j) = 0$ is
convergent. It also makes sense to talk about continuous families of subspaces of
$V$. The analysis of the topological space of all subspaces of a finite-dimensional
inner product space has proven to be an extremely important tool.
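The projection onto an affine set described in item (1) is a one-line computation. The Python sketch below assumes $V = \mathbb{R}^3$ with the dot product and implements the map $y \mapsto y - \frac{\langle y, v\rangle - c}{\|v\|^2}v$; the function name `affine_projection` and the sample data are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def affine_projection(y, v, c):
    """Project y onto the affine set {w : <w, v> = c}; v must be nonzero."""
    t = (dot(y, v) - c) / dot(v, v)
    return [a - t * b for a, b in zip(y, v)]

v = [1.0, 2.0, 2.0]
c = 3.0
y = [4.0, 0.0, 1.0]
p = affine_projection(y, v, c)
print(p, dot(p, v))                       # the image satisfies <p, v> = c
print(affine_projection(p, v, c))         # projecting again changes nothing
```

Applying such projections cyclically for several pairs $(v, c)$ is the basic step of row-action methods such as ART.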
Exercises


Exercise 1006 
Let A= | J € M>x2(R). Does there exist a matrix B € M2x2(R) of the 
cos(t) —sin(t) 
form | sin(t) cos(t) 
considered as elements of the space R*, endowed with the dot product? 


| such that the columns of A + B are orthogonal, when 


Exercise 1007 


3 
Calculate the angle between the vectors | 1 | and 1 | in the space R?, en- 
—2 
dowed with the dot product. 
Exercise 1008 
1 1 
1 -1 
Calculate the angle between the vectors | 1 | and 1 | in the space R5, en- 
2 -1 
1 1 


dowed with the dot product. 


Exercise 1009 

Let n be a positive integer. A matrix $A = [a_{hj}] \in M_{n \times n}(\mathbb{C})$ is a complex
Hadamard matrix if and only if $|a_{hj}| = 1$ for all $1 \le h, j \le n$ and any pair of
distinct rows of A, considered as vectors in $\mathbb{C}^n$, is orthogonal. For each n, find
a complex number d such that $A = [d^{hj}]$ is a complex Hadamard matrix. For
n = 6, find a complex Hadamard matrix which is not of this form.


Exercise 1010 

Let A and B be nonempty subsets of $\mathbb{R}^3$ which satisfy the condition that $u \times v \in B$
whenever $u \in A$ and $v \in B$. Is it true that $u \times w \in B^\perp$ whenever $u \in A$ and
$w \in B^\perp$?


Exercise 1011 
Let V = C(0, 1) on which we have defined the inner product $\langle f, g \rangle = \int_0^1 f(x) g(x)\,dx$. Calculate $\|\cos(t)\|$.


Exercise 1012 
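Let $V = M_{2 \times 2}(\mathbb{R})$ and define an inner product on V by setting

$$\left\langle \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} \right\rangle = \sum_{i=1}^{2} \sum_{j=1}^{2} a_{ij} b_{ij}.$$
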
Find the angle between the matrices | ; | and E | . 


Exercise 1013 
Let V be the space R4, together with the dot product. Find a normal vector in V 


1 1 2 

EE —1 1 
which is orthogonal to each of the vectors rele and l 
1 1 3 


Exercise 1014 
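Let $V = \mathbb{R}^3$ on which some inner product is defined. Does there exist a vector
$0_V \ne v \in V$ which is orthogonal to each of the vectors
$\begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix}$, $\begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}$, and $\begin{bmatrix} 2 \\ 0 \\ 4 \end{bmatrix}$?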


Exercise 1015 
Let f, g € RË be defined by f : x > x and g : x > x? — }. Are f and g orthog- 
onal as elements of C (0, 1)? Are they orthogonal as elements of C (0, 2)? 


Exercise 1016 
Find a real number c such that $\|v - w\| = c$ for every orthonormal pair {v, w} of
vectors in $\mathbb{R}^n$, on which the dot product is defined.


Exercise 1017 

Let n be a positive integer and let $c_1, \ldots, c_n$ be a list of real numbers. Let
$\{v_1, \ldots, v_n\}$ be an orthonormal basis for $\mathbb{R}^n$ and let $d = \min\{c_1, \ldots, c_n\}$. For each
$1 \le i \le n$, set $d_i = \sqrt{c_i - d}$ and let $w_i = d_i v_i$. Let $B \in M_{n \times n}(\mathbb{R})$ be the matrix
the columns of which are $w_1, \ldots, w_n$ and let $A = BB^T + dI$. For $1 \le i \le n$, show
that $v_i$ is an eigenvector of A associated with the eigenvalue $c_i$.


Exercise 1018 
Let V = C(0, 1) on which we have defined the inner product $\langle f, g \rangle = \int_0^1 f(x) g(x)\,dx$, and let $W = \mathbb{R}\{e^x\} \subseteq V$. Find an infinite set of elements of $W^\perp$.


Exercise 1019 
Let V = R* on which some inner product is defined. Find distinct vectors 
v, w, y E€ V such that v L w and w L y, but not v L y. 


Exercise 1020 
Let V be an inner product space over $\mathbb{R}$ and let v and w be vectors in V. Show
that $\|v\| = \|w\|$ if and only if $(v + w) \perp (v - w)$.
Exercise 1021

Let V = C(−1, 1) and define an inner product on V by setting $\langle f, g \rangle = \int_{-1}^{1} f(x) g(x)\,dx$. Let W be the subspace of V composed of all even functions.
Find $W^\perp$.


Exercise 1022 

Let n be a positive integer and let V = Mnxn(C), on which we have an inner 
product defined by (A, B) = tr(A? B). Let W be the subspace of V consisting of 
all those matrices A € V satisfying tr(A) = 0. Find WŁ. 


Exercise 1023 


Define an inner product on R? with respect to which the vectors |- I and A 


are orthogonal. 


Exercise 1024 
Make use of the Gram-Schmidt process to find an orthonormal basis for the 
space R? together with the dot product, beginning with the initial basis 


1 1 1 
Lj =2 a 2 
1 1 3 


Exercise 1025 

Let V be an inner product space and let A be an orthonormal subset of V. Show
that A is a maximal orthonormal subset if and only if for every $0_V \ne y \in V$ there
exists a $v \in A$ satisfying $\langle v, y \rangle \ne 0$.


Exercise 1026 

Let V be an inner product space of finite dimension n over its field of scalars.
Show that there exists a subset $\{v_1, \ldots, v_{2n}\}$ of V satisfying the condition that
$\langle v_i, v_j \rangle \le 0$ for all $1 \le i \ne j \le 2n$.


Exercise 1027 

Let V be the space of all polynomial functions in RF of degree less than 3, with 
inner product (p,q) = aie p(t)q(t) dt. Find an orthonormal basis { po, p1, p2} 
of V satisfying deg(p;,) = h for h = 0, 1, 2. 


Exercise 1028 
Consider the function $\mu : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}$ given by

$$\mu : \left( \begin{bmatrix} a \\ b \\ c \end{bmatrix}, \begin{bmatrix} a' \\ b' \\ c' \end{bmatrix} \right) \mapsto 2aa' + ac' + ca' + bb' + cc'.$$

Show that $\mu$ is an inner product and find a basis of $\mathbb{R}^3$ orthonormal with respect
to $\mu$.


Exercise 1029 

Let V be an inner product space over $\mathbb{R}$ and let W be a finitely-generated subspace
of V with orthonormal basis $\{w_1, \ldots, w_n\}$. Let $\alpha \in \mathrm{Hom}(V, W)$ be defined
by $\alpha : v \mapsto \sum_{i=1}^{n} \langle v, w_i \rangle w_i$. Show that $\alpha(v) - v \in W^\perp$ for all $v \in V$ and that
$\|\alpha(v) - v\| < \|w - v\|$ for all $\alpha(v) \ne w \in W$.


Exercise 1030 
Let W be the subspace of R4 spanned by linearly-independent subset 


ano 


which is an inner product space with respect to the dot product. Make use of the 
Gram-Schmidt process to find an orthonormal basis for W. 


Exercise 1031 

Let n be a positive integer and let A € Mnxn(R). If the set of rows of A is or- 
thonormal with respect to the dot product, is the same true for the set of columns 
of A? 


Exercise 1032 

Let n > k be positive integers and let $A \in M_{k \times n}(\mathbb{R})$ satisfy the condition
that its set of rows is orthonormal with respect to the dot product. Show that
$(A^T A)^2 = A^T A$.


Exercise 1033 

Let n be a positive integer and let $A \in M_{n \times n}(\mathbb{R})$. Show that A is symmetric if
and only if for some k < n there exists a matrix $B \in M_{n \times k}(\mathbb{R})$ and a real number
r such that $A = BB^T + rI$ and the columns of B are mutually orthogonal.


Exercise 1034 


Let W be the subspace of R4, which is an inner product space with respect to the 
2 0 4 1 


dot product, generated by o ; i ; i ; l . Find an orthonormal 
0 2 4 1 


basis for W and an orthonormal basis for W+. 
Exercise 1035
Consider R4 as an inner product space with respect to the dot product. Add two 


1 1 
vectors to the set į g 31°21 4 in order to get an orthonormal basis for 
=5 1 


this space. 


Exercise 1036 

Let n be a positive integer and let V = R”, which is an inner product space with 
respect to the dot product. Let {v1, ..., Un} be an orthonormal basis for V, let 
a € R, and let 1 < h Æ k <n. Define vectors w1, ..., Wn in V by setting 


cos(a)vp — sin(a)u, ifi=hħ, 


wi = { sin(a)v_, —cos(a)vy ifi=k, 
vi otherwise. 
Is {w1,..., Wn} an orthonormal basis for V? 


Exercise 1037 
Consider R? as an inner product space with respect to the dot product. Is there 


4k 0 —25 
ak € Z such that 4|, ik], 16k is an orthonormal basis for this 
—k 4 —12k 


space? 


Exercise 1038 
Consider R* as an inner product space with respect to the dot product. Find an 


1 1 (0) 
orthonormal basis for R 0 P ! , : 
1 2 1 
(0) 1 2 


Exercise 1039 
Define an inner product on R? by setting 


(l ; Bi =ac + sad + be) + bd. 


Find an orthonormal basis for this space. 


Exercise 1040 
Consider R? as an inner product space with respect to the dot product. Let 
a, b, c, d be nonzero real numbers satisfying the conditions that a? +b? +c? = d? 
a c 
and ab + ac = bc. Show that the subset 4 4 | b ! c 1 a of R? 
c 


is orthonormal. 


Exercise 1041 
Define an inner product on R? by setting 


d1 bı 
( a |,| b2 ) = aby + 2(a2b2 + a3b3) — (aı1b2 + a2b1) — (a2b3 + a3b2). 
a3 b3 


Find an orthonormal basis for this space. 


Exercise 1042 
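Let m be a positive integer and let

$$W = \{ f \in \mathbb{C}^{\mathbb{Z}} \mid f(i + m) = f(i) \text{ for all } i \in \mathbb{Z} \},$$

which is a subspace of the vector space $\mathbb{C}^{\mathbb{Z}}$ over $\mathbb{C}$. Define a function
$\mu : W \times W \to \mathbb{C}$ by setting $\mu : (f, g) \mapsto \sum_{h=0}^{m-1} f(h) \overline{g(h)}$. For each $0 \le j < m$,
let $f_j \in W$ be the function defined by

$$f_j(h) = \begin{cases} 1 & \text{if } h \text{ is of the form } j + mi \text{ for } i \in \mathbb{Z}, \\ 0 & \text{otherwise.} \end{cases}$$

Show that $\mu$ is an inner product on W and that, with respect to that product,
$\{f_0, \ldots, f_{m-1}\}$ is an orthonormal basis for W.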


Exercise 1043 

A function f € RË has bounded support if and only if there exist real numbers 
a <b such that f(x) = 0 for all x not in the interval [a, b] on the real line. Let V 
be the set of all such functions and define a function u : V x V > R by setting 
Uf gZ= Chae Sf (x)g(x) dx. For each k € Z, let fg € V be the function defined 
by 


1 ifk<x<k-+1, 
0 otherwise. 


fax | 


Show that u is an inner product on V and that the subset {fp | k € Z of V is 
orthonormal with respect to this inner product. 


Exercise 1044 

Let V be the subspace of RË consisting of all infinitely-differentiable functions 
f which are periodic of period h > 0. (In other words, f(x + h) = f(x) for all 
x € R.) Define an inner product on V by setting (f, g) = ce Ft (x)g(x) dx. Let 
a be the endomorphism of V which assigns to every element of V its derivative. 
Find a*. 




Exercise 1045 

Let V be an inner product space over $\mathbb{R}$ and let $\{v_1, \ldots, v_n\}$ be a set of
mutually-orthogonal nonzero vectors in V. Let $a_1, \ldots, a_n$ be positive real numbers
satisfying $\sum_{i=1}^{n} a_i = 1$, and let $w = \sum_{i=1}^{n} a_i v_i$. Suppose that $w \perp v_i - v_j$ for all
$1 \le i \ne j \le n$. Show that $\|w\|^{-2} = \sum_{i=1}^{n} \|v_i\|^{-2}$.


Exercise 1046 
Let V be a finitely-generated inner product space and let $\alpha, \beta_1, \beta_2 \in \mathrm{End}(V)$
satisfy $\alpha^*\alpha\beta_1 = \alpha^*\alpha\beta_2$. Show that $\alpha\beta_1 = \alpha\beta_2$.


Exercise 1047 
If W and Y are subspaces of a finitely-generated inner product space V, show 
that g(W, Y) is the maximum of 


sup{d(w, Y) | w € W and ||w|| = 1} and sup{d(y, W) | y € Y and ||y|| = 1}. 


Exercise 1048 
Let W and Y be subspaces of a finite-dimensional inner product space V satisfy- 
ing g(W, Y) < 1. Show that dim(W) = dim(Y). 


Exercise 1049 

Let K be an algebra over a field F on which we have an involution $a \mapsto a^*$
defined. Let c be an element of K satisfying the conditions that $c = c^2$ and that
$c + c^* - 1$ is a unit of K. Show that there exists an element $d \in K$ satisfying
$d^2 = d = d^*$, $dc = c$, and $cd = d$. Is d necessarily unique?


Exercise 1050 
Let V be an inner product space over $\mathbb{C}$ and let $\alpha$ be an endomorphism of V
satisfying $\alpha^* = -\alpha$. Show that every eigenvalue of $\alpha$ is purely imaginary.


17 Selfadjoint Endomorphisms


Let V be an inner product space. An endomorphism $\alpha$ of V is selfadjoint if and
only if $\langle \alpha(v), w \rangle = \langle v, \alpha(w) \rangle$ for all $v, w \in V$. Such endomorphisms always exist
since $\sigma_c$ is selfadjoint for any $c \in \mathbb{R}$. Selfadjoint endomorphisms have important
applications in mathematical models in physics. For example, in mathematical models
of quantum theory, selfadjoint operators on the state space of a system represent
measurements which can be performed on the system. Note that if $\alpha \in \mathrm{End}(V)$
is selfadjoint, then $\langle \alpha(v), v \rangle = \langle v, \alpha(v) \rangle = \overline{\langle \alpha(v), v \rangle}$ and so $\langle \alpha(v), v \rangle \in \mathbb{R}$ for all
$v \in V$.


Example Let V = C(0, 1), which is an inner product space over $\mathbb{R}$ in which
$\langle f, g \rangle = \int_0^1 f(x) g(x)\,dx$. Then the endomorphism $\alpha$ of V defined by
$\alpha(f) : x \mapsto \int_0^1 \cos(x - y) f(y)\,dy$ for all $f \in V$ is selfadjoint.


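Proposition 17.1 Let V be an inner product space. Then:

(1) If $\alpha \in \mathrm{End}(V)$ has an adjoint $\alpha^*$, then $\alpha + \alpha^*$ is selfadjoint;

(2) If $\alpha \in \mathrm{End}(V)$ is selfadjoint, so is $c\alpha$ for each $c \in \mathbb{R}$;

(3) If $\alpha \in \mathrm{End}(V)$ is selfadjoint, so is $\alpha^n$ for each positive integer n;

(4) If $\alpha, \beta \in \mathrm{End}(V)$ are selfadjoint, so are $\alpha + \beta$ and $\alpha \bullet \beta$, where $\bullet$ is the
Jordan product in End(V);

(5) If $\alpha \in \mathrm{End}(V)$ is selfadjoint and $\beta \in \mathrm{End}(V)$ has an adjoint, then $\beta\alpha\beta^*$
is selfadjoint.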


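Proof (1) If $v, w \in V$ then

$$\begin{aligned} \langle (\alpha + \alpha^*)(v), w \rangle &= \langle \alpha(v), w \rangle + \langle \alpha^*(v), w \rangle \\ &= \langle v, \alpha^*(w) \rangle + \langle v, \alpha(w) \rangle = \langle v, (\alpha + \alpha^*)(w) \rangle. \end{aligned}$$

(2) If $v, w \in V$ then $\langle (c\alpha)(v), w \rangle = c\langle \alpha(v), w \rangle = c\langle v, \alpha(w) \rangle = \langle v, (c\alpha)(w) \rangle$.
(3) This follows by an easy induction argument, using Proposition 16.17(3).
(4) The selfadjointness of $\alpha + \beta$ is an immediate consequence of Proposition 16.17(1).
Also, recall that $\alpha \bullet \beta = \frac{1}{2}(\alpha\beta + \beta\alpha)$ and so, if $v, w \in V$ then

$$\begin{aligned} \langle (\alpha \bullet \beta)(v), w \rangle &= \tfrac{1}{2}\langle \alpha\beta(v), w \rangle + \tfrac{1}{2}\langle \beta\alpha(v), w \rangle \\ &= \tfrac{1}{2}\langle v, \beta\alpha(w) \rangle + \tfrac{1}{2}\langle v, \alpha\beta(w) \rangle = \langle v, (\alpha \bullet \beta)(w) \rangle. \end{aligned}$$

(5) By Proposition 16.17, we have $(\beta\alpha\beta^*)^* = \beta^{**}\alpha\beta^* = \beta\alpha\beta^*$.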


In particular, if $\alpha$ is a selfadjoint endomorphism of an inner product space V, and
if $p(X) \in \mathbb{R}[X]$, then $p(\alpha)$ is selfadjoint. The product of selfadjoint endomorphisms
of V need not be selfadjoint, as we will see in the example after Proposition 17.4.
Thus we see that the set of all selfadjoint endomorphisms of V is a subspace, though
not necessarily a subalgebra, of End(V).


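Example Let V be an inner product space over $\mathbb{R}$ and let $\alpha \in \mathrm{End}(V)$ be selfadjoint.
Let $a, b \in \mathbb{R}$ satisfy the condition that $a^2 < 4b$. Then, by the previous remark, we
know that $\beta = \alpha^2 + a\alpha + b\sigma_1$ is again a selfadjoint endomorphism of V. Moreover,
if $0_V \ne v \in V$ then

$$\begin{aligned} \langle \beta(v), v \rangle &= \langle \alpha^2(v), v \rangle + a\langle \alpha(v), v \rangle + b\langle v, v \rangle \\ &= \langle \alpha(v), \alpha(v) \rangle + a\langle \alpha(v), v \rangle + b\langle v, v \rangle \\ &= \|\alpha(v)\|^2 + a\langle \alpha(v), v \rangle + b\|v\|^2. \end{aligned}$$

By Proposition 15.2, we know that $|\langle \alpha(v), v \rangle| \le \|\alpha(v)\| \cdot \|v\|$ and so

$$\begin{aligned} \langle \beta(v), v \rangle &\ge \|\alpha(v)\|^2 - |a| \cdot \|\alpha(v)\| \cdot \|v\| + b\|v\|^2 \\ &= \left( \|\alpha(v)\| - \tfrac{1}{2}|a| \cdot \|v\| \right)^2 + \left( b - \tfrac{1}{4}a^2 \right) \|v\|^2 > 0. \end{aligned}$$

Thus $\beta(v) \ne 0_V$ for each $0_V \ne v \in V$, showing that $\beta$ is monic. In particular, if V
is finitely generated, this in fact shows that $\beta$ is an automorphism of V.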

Let V be a finitely-generated inner product space having an orthonormal basis D.
If $\alpha \in \mathrm{End}(V)$ and if $\Phi_{DD}(\alpha) = [a_{ij}]$, then we know from Proposition 16.16 that
$\Phi_{DD}(\alpha^*) = \Phi_{DD}(\alpha)^H = [a_{ij}]^H$. Therefore, if $\alpha$ is selfadjoint we have $a_{ij} = \overline{a_{ji}}$
for all $1 \le i, j \le n$. In particular, $a_{ii} = \overline{a_{ii}}$ for all $1 \le i \le n$ and so the diagonal
entries in $\Phi_{DD}(\alpha)$ belong to $\mathbb{R}$. Matrices A over $\mathbb{C}$ satisfying the condition that
$A = A^H$ are known as Hermitian matrices. When we are working over $\mathbb{R}$, these are,
of course, just the symmetric matrices. It is clear that the sum of Hermitian matrices
is again a Hermitian matrix, but the product of Hermitian matrices is not necessarily
Hermitian, just as we have seen that the product of symmetric matrices is not necessarily
symmetric. We do note, however, that if a matrix $A \in M_{n \times n}(\mathbb{C})$ is Hermitian
then so is $A^2$. Indeed, if A and B are Hermitian matrices in $M_{n \times n}(\mathbb{C})$, then their
Jordan product $\frac{1}{2}(AB + BA)$ is a Hermitian matrix and so, in particular, the product
of a commuting pair of Hermitian matrices is again Hermitian. Moreover, any
matrix $D \in M_{n \times n}(\mathbb{C})$ can be written in the form $A + iB$, where $A = \frac{1}{2}(D + D^H)$
and $B = -\frac{i}{2}(D - D^H)$ are both Hermitian matrices. If $A \in M_{n \times n}(\mathbb{C})$ is Hermitian,
then so is cA for any $c \in \mathbb{R}$, and so the set of all Hermitian matrices in $M_{n \times n}(\mathbb{C})$
is a subspace of $M_{n \times n}(\mathbb{C})$, considered as a vector space over $\mathbb{R}$; indeed, it is a
subalgebra of the commutative Jordan $\mathbb{R}$-algebra $M_{n \times n}(\mathbb{C})^+$. However, this set is
not closed under multiplication by complex scalars, and so it is not a vector space
over $\mathbb{C}$.
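The decomposition of an arbitrary complex matrix D into Hermitian parts, D = A + iB with A = (D + D^H)/2 and B = -(i/2)(D - D^H), can be checked numerically. The following is only a minimal sketch using nested Python lists; the function names and the sample matrix are illustrative assumptions, not part of the text.

```python
def conj_transpose(m):
    """Return the conjugate transpose M^H of a square matrix (nested lists)."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]


def hermitian_parts(d):
    """Split D into A = (D + D^H)/2 and B = -(i/2)(D - D^H), so D = A + iB."""
    dh = conj_transpose(d)
    n = len(d)
    a = [[(d[i][j] + dh[i][j]) / 2 for j in range(n)] for i in range(n)]
    b = [[-0.5j * (d[i][j] - dh[i][j]) for j in range(n)] for i in range(n)]
    return a, b


def is_hermitian(m, tol=1e-12):
    """Check M = M^H entrywise, up to a small tolerance."""
    mh = conj_transpose(m)
    n = len(m)
    return all(abs(m[i][j] - mh[i][j]) <= tol for i in range(n) for j in range(n))


# Hypothetical sample matrix, chosen only for illustration.
D = [[1 + 2j, 3 - 1j], [4j, -2 + 0.5j]]
A, B = hermitian_parts(D)
recombined = [[A[i][j] + 1j * B[i][j] for j in range(2)] for i in range(2)]
print(is_hermitian(A), is_hermitian(B))
```

Both parts come out Hermitian and recombine to D, although D itself is not Hermitian.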

We have already seen that if V is an inner product space and if a € End(V) is 
selfadjoint, then (a(v), v) € R for all v € V. If V is finitely generated, the reverse is 
also true, as follows immediately from the following result. 


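Proposition 17.2 Let V be an inner product space over $\mathbb{C}$ and let $\alpha \in \mathrm{End}(V)$
have an adjoint. If $\langle \alpha(v), v \rangle \in \mathbb{R}$ for all $v \in V$, then $\alpha$ is selfadjoint.


Proof For vectors $v, w \in V$, we have

$$\langle \alpha(v + w), v + w \rangle = \langle \alpha(v), v \rangle + \langle \alpha(v), w \rangle + \langle \alpha(w), v \rangle + \langle \alpha(w), w \rangle$$

and since, by assumption, we know that the scalars $\langle \alpha(v + w), v + w \rangle$, $\langle \alpha(v), v \rangle$,
and $\langle \alpha(w), w \rangle$ are all real, we see that $\langle \alpha(v), w \rangle + \langle \alpha(w), v \rangle \in \mathbb{R}$ as well. This sum
therefore equals its own complex conjugate, that is,
$\langle \alpha(v), w \rangle + \langle \alpha(w), v \rangle = \langle w, \alpha(v) \rangle + \langle v, \alpha(w) \rangle$, and so

$$i\langle \alpha(v), w \rangle + i\langle \alpha(w), v \rangle = i\langle w, \alpha(v) \rangle + i\langle v, \alpha(w) \rangle.$$

Similarly,

$$\langle \alpha(v + iw), v + iw \rangle = \langle \alpha(v), v \rangle - i\langle \alpha(v), w \rangle + i\langle \alpha(w), v \rangle + \langle \alpha(w), w \rangle$$

and so $-i\langle \alpha(v), w \rangle + i\langle \alpha(w), v \rangle \in \mathbb{R}$. This implies that

$$-i\langle \alpha(v), w \rangle + i\langle \alpha(w), v \rangle = i\langle w, \alpha(v) \rangle - i\langle v, \alpha(w) \rangle$$

and so, adding this to the previous equality, we get $2i\langle \alpha(w), v \rangle = 2i\langle w, \alpha(v) \rangle$,
whence $\langle \alpha(w), v \rangle = \langle w, \alpha(v) \rangle$ for all $v, w \in V$. Therefore, $\alpha = \alpha^*$.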


Proposition 17.3 Let V be an inner product space and let $\sigma_0 \ne \alpha \in \mathrm{End}(V)$
be selfadjoint. Then there exists a vector $v \in V$ satisfying $\langle \alpha(v), v \rangle \ne 0$.


Proof First, assume that the field of scalars is $\mathbb{C}$. Then it is easy to check that if
$v, w \in V$ then

$$\begin{aligned} \langle \alpha(v), w \rangle = {} & \tfrac{1}{4}\left[ \langle \alpha(v + w), v + w \rangle - \langle \alpha(v - w), v - w \rangle \right] \\ & + \tfrac{i}{4}\left[ \langle \alpha(v + iw), v + iw \rangle - \langle \alpha(v - iw), v - iw \rangle \right]. \end{aligned}$$

Moreover, each term on the right-hand side of this equality is of the form $\langle \alpha(y), y \rangle$
for some $y \in V$, so if all of these were equal to 0 we would see that $\alpha(v) \perp w$ for all
$v, w \in V$, which means that $\alpha(v) = 0_V$ for all $v \in V$, contradicting the hypothesis
that $\sigma_0 \ne \alpha$. Thus the desired result must hold.

Now assume that the field of scalars is $\mathbb{R}$. Then for all $v, w \in V$ we have
$\langle \alpha(w), v \rangle = \langle w, \alpha(v) \rangle = \langle \alpha(v), w \rangle$ and so

$$\langle \alpha(v), w \rangle = \tfrac{1}{4}\left[ \langle \alpha(v + w), v + w \rangle - \langle \alpha(v - w), v - w \rangle \right].$$

Again, each term on the right-hand side of this equality is of the form $\langle \alpha(y), y \rangle$ for
some $y \in V$, so if all of these were equal to 0 we would have $\alpha(v) \perp w$ for all
$v, w \in V$ which, as we have seen in the previous case, leads to a contradiction.


We now return to a new variant of a question we have already posed: If $\alpha$ is an
endomorphism of an inner product space V, when does there exist an orthonormal
basis of V composed of eigenvectors of $\alpha$? If such a basis exists, we say that $\alpha$ is
orthogonally diagonalizable.


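Example Let $V = \mathbb{R}^3$ with the dot product, and let $\alpha \in \mathrm{End}(V)$ be given by
$\alpha : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} b \\ a \\ c \end{bmatrix}$. Then
$\left\{ \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right\}$
is an orthonormal basis of V composed of eigenvectors of $\alpha$, so $\alpha$ is orthogonally diagonalizable.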


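Proposition 17.4 Let V be an inner product space and let $\alpha \in \mathrm{End}(V)$ be
selfadjoint. Then $\mathrm{spec}(\alpha) \subseteq \mathbb{R}$ and eigenvectors of $\alpha$ associated with distinct
eigenvalues are orthogonal.


Proof Let c be an eigenvalue of $\alpha$ and let v be an eigenvector of $\alpha$ associated
with c. Then $c\langle v, v \rangle = \langle cv, v \rangle = \langle \alpha(v), v \rangle = \langle v, \alpha(v) \rangle = \langle v, cv \rangle = \bar{c}\langle v, v \rangle$ and so,
since $v \ne 0_V$, we see that $c = \bar{c}$, proving that $c \in \mathbb{R}$. Thus we have shown the first
assertion.

If c and d are distinct eigenvalues of $\alpha$ associated with eigenvectors v and w,
respectively, then $c\langle v, w \rangle = \langle cv, w \rangle = \langle \alpha(v), w \rangle = \langle v, \alpha(w) \rangle = \langle v, dw \rangle = d\langle v, w \rangle$,
where the last equality holds because d is real. Since $c \ne d$, this implies that $\langle v, w \rangle = 0$.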
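For a real symmetric 2 × 2 matrix, both assertions of Proposition 17.4 can be observed directly from the closed-form solution of the characteristic polynomial. The sketch below is illustrative only: the helper name, the sample matrix, and the assumption that the off-diagonal entry is nonzero are not taken from the text.

```python
import math


def sym2_eigen(a, b, d):
    """Eigenpairs of the real symmetric matrix [[a, b], [b, d]], assuming b != 0.

    The eigenvalues are the roots of X^2 - (a + d)X + (ad - b^2); the
    discriminant (a - d)^2 + 4b^2 is nonnegative, so both roots are real.
    For each eigenvalue c, the vector (b, c - a) is an eigenvector.
    """
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    c1, c2 = mean + radius, mean - radius
    return (c1, (b, c1 - a)), (c2, (b, c2 - a))


# Hypothetical sample matrix [[2, 1], [1, 3]].
(c1, v1), (c2, v2) = sym2_eigen(2.0, 1.0, 3.0)
dot_product = v1[0] * v2[0] + v1[1] * v2[1]
print(c1, c2, dot_product)
```

The two eigenvalues are real and distinct, and the dot product of the corresponding eigenvectors vanishes up to rounding error.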
Example The endomorphisms « and £ of C? which are given by æ : Bl = | 


—b 
{i, —i} and so, by Proposition 17.4, Ba is not selfadjoint. 


and £: B = | J can easily be seen to be selfadjoint. However, spec(6a) = 


We note the following consequence of Proposition 17.4: If the matrix 
A € Mnxn(R) is symmetric, then all eigenvalues of A are real, and so the char- 
acteristic polynomial of A is completely reducible in R[X]. 

By Proposition 17.4, we see that if V is an inner-product space of finite dimension
n over $\mathbb{C}$ and if $\alpha \in \mathrm{End}(V)$ is selfadjoint, then the eigenvalues of $\alpha$ can be
written uniquely as an n-tuple $(c_1(\alpha), \ldots, c_n(\alpha))$ of real numbers, the entries of
which form a nonincreasing sequence. If $\alpha, \beta \in \mathrm{End}(V)$ are selfadjoint, the problem
of describing the possible sets of eigenvalues of $\alpha + \beta$ in terms of those of $\alpha$ and
$\beta$ is extremely important in particle physics, and is known as Weyl's Problem. Weyl
himself showed that if $\alpha, \beta \in \mathrm{End}(V)$ are selfadjoint, if $1 \le k \le i \le n$, and if $1 \le j \le n - i + 1$, then
$c_{i+j-1}(\alpha) + c_{n-j+1}(\beta) \le c_i(\alpha + \beta) \le c_{i-k+1}(\alpha) + c_k(\beta)$ and so, if $j = k = 1$, we
have $c_i(\alpha) + c_n(\beta) \le c_i(\alpha + \beta) \le c_i(\alpha) + c_1(\beta)$ for each $1 \le i \le n$. In particular,
$c_1(\alpha + \beta) \le c_1(\alpha) + c_1(\beta)$ and $c_n(\alpha) + c_n(\beta) \le c_n(\alpha + \beta)$. Since then, this problem
has been extensively studied. A solution in terms of probability measures was given
by the Australian mathematicians Anthony H. Dooley and Norman J. Wildberger,
together with the Canadian mathematician Joe Repka, in 1993.
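The special case j = k = 1 of Weyl's inequalities, $c_i(\alpha) + c_n(\beta) \le c_i(\alpha + \beta) \le c_i(\alpha) + c_1(\beta)$, is easy to test on small examples. The following sketch does so for real symmetric 2 × 2 matrices, whose eigenvalues have a closed form; the encoding of a matrix [[a, b], [b, d]] as the triple (a, b, d) and the sample matrices are assumptions made purely for illustration.

```python
import math


def eigenvalues_sym2(a, b, d):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]], largest first."""
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return [mean + radius, mean - radius]


def weyl_bounds_hold(alpha, beta, tol=1e-12):
    """Check c_i(alpha) + c_n(beta) <= c_i(alpha + beta) <= c_i(alpha) + c_1(beta)."""
    total = tuple(x + y for x, y in zip(alpha, beta))
    ca, cb, cs = (eigenvalues_sym2(*m) for m in (alpha, beta, total))
    return all(
        ca[i] + cb[-1] - tol <= cs[i] <= ca[i] + cb[0] + tol
        for i in range(2)
    )


alpha = (2.0, 1.0, 3.0)   # encodes [[2, 1], [1, 3]]
beta = (1.0, -2.0, 0.0)   # encodes [[1, -2], [-2, 0]]
print(weyl_bounds_hold(alpha, beta))
```

A numerical check of this kind illustrates the inequalities on one pair of matrices; it is of course no substitute for Weyl's proof.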

Weyl's result has many interesting consequences. We note that if V is an inner
product space and if $\alpha \in \mathrm{End}(V)$ is selfadjoint and represented with respect to a
given basis by some matrix $A \in M_{n \times n}(\mathbb{C})$ then the diagonal elements of A must
also be real. Schur proved that if A is a matrix representing such an endomorphism
$\alpha$ having eigenvalues $c_1(\alpha) \ge \cdots \ge c_n(\alpha)$ and diagonal entries $p_1 \ge \cdots \ge p_n$ then
$\sum_{j=1}^{k} p_j \le \sum_{j=1}^{k} c_j(\alpha)$ for all $1 \le k \le n - 1$ and $\sum_{j=1}^{n} p_j = \sum_{j=1}^{n} c_j(\alpha)$. The
converse was proven by the American mathematician Alfred Horn: if $c_1 \ge \cdots \ge c_n$
and $p_1 \ge \cdots \ge p_n$ are sequences of real numbers satisfying $\sum_{j=1}^{k} p_j \le \sum_{j=1}^{k} c_j$
for all $1 \le k \le n - 1$ and $\sum_{j=1}^{n} p_j = \sum_{j=1}^{n} c_j$ then there exists a selfadjoint
endomorphism $\alpha$ of $\mathbb{C}^n$ with eigenvalues $c_1, \ldots, c_n$ and having an orthonormal
basis relative to which $\alpha$ is represented by a matrix having diagonal entries
$p_1, \ldots, p_n$.

We now turn to the problem of finding the eigenvalues of a selfadjoint endomorphism
of a finitely-generated inner product space. This problem arises in many
important applications. For example, let $\Gamma$ be a (nondirected) graph with vertex
set {1, ..., n}. We associate to this graph a symmetric matrix, called the adjacency
matrix $[a_{ij}]$, the entries of which are nonnegative integers, by setting $a_{ij}$ to be the
number of edges in $\Gamma$ connecting vertex i to vertex j. The matrix represents a selfadjoint
endomorphism of $\mathbb{R}^n$ with respect to some basis and its spectrum can be
used to derive important information about $\Gamma$. This technique has important
applications in the analysis of computer networks, in the design of error-correcting
codes, and in such areas as chemistry, where it is used to make rough estimates of
the electron density distribution of molecules. Another example is the following: If
$B = \{v_1, \ldots, v_n\}$ is a set of distinct vectors in $\mathbb{R}^n$, and if $\| \cdot \|$ is a norm on $\mathbb{R}^n$, then
the distance matrix defined by B is the matrix $[\|v_i - v_j\|]$. This matrix is symmetric
and so defines a selfadjoint endomorphism of $\mathbb{R}^n$. Computing the eigenvalues
of such matrices has important applications in many areas, including bioinformatics
and X-ray crystallography.
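As a small illustration of the first construction, the adjacency matrix of a nondirected graph can be assembled from a list of edges. This is a sketch under stated assumptions: vertices are numbered 1, ..., n as in the text, an edge is a pair (i, j), and a loop at a vertex is counted once on the diagonal (a convention chosen here, not one fixed by the text).

```python
def adjacency_matrix(n, edges):
    """Adjacency matrix [a_ij] of a nondirected graph on vertices 1..n.

    a_ij counts the edges joining vertex i to vertex j; parallel edges are
    allowed, and a loop (i, i) contributes 1 to a_ii (assumed convention).
    """
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i - 1][j - 1] += 1
        if i != j:
            a[j - 1][i - 1] += 1
    return a


def is_symmetric(a):
    """Check that a square matrix equals its transpose."""
    n = len(a)
    return all(a[i][j] == a[j][i] for i in range(n) for j in range(n))


# Hypothetical graph: a triangle on {1, 2, 3} with a doubled edge between 1 and 3.
example_edges = [(1, 2), (2, 3), (1, 3), (1, 3)]
example_matrix = adjacency_matrix(3, example_edges)
print(example_matrix, is_symmetric(example_matrix))
```

Because each edge is recorded in both positions (i, j) and (j, i), the resulting matrix is symmetric by construction.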


Proposition 17.5 If V is a nontrivial finitely-generated inner product space,
then $\mathrm{spec}(\alpha) \ne \emptyset$ for any selfadjoint endomorphism $\alpha$ of V.


Proof Let $\alpha$ be a selfadjoint endomorphism of V. Choose an orthonormal basis
$B = \{v_1, \ldots, v_n\}$ for V and let $A = \Phi_{BB}(\alpha)$. Since $\alpha$ is selfadjoint, we know that
$A = A^H$. Let $W = \mathbb{C}^n$ on which we have defined the dot product. Then the endomorphism
$\beta$ of W defined by $\beta : w \mapsto Aw$ is selfadjoint. The degree of the characteristic
polynomial $|XI - A|$ of $\beta$ is n > 0 and so, by the Fundamental Theorem of
Algebra, it has a root $c \in \mathbb{C}$. Thus the matrix $cI - A$ is singular and so there exists a
nonzero vector $w \in W$ satisfying $Aw = cw$. In other words, $c \in \mathrm{spec}(\beta)$. By Proposition 17.4,
this implies that $c \in \mathbb{R}$ and so $c \in \mathrm{spec}(\alpha)$, even if V is an inner product
space over $\mathbb{R}$.


In particular, we learn from Proposition 17.5 that every symmetric matrix over 
R has an eigenvalue in R. Compare this to the example we have already seen of a 
symmetric matrix in M2,2(GF(2)) having no eigenvalues. Similarly, the symmetric 


matrix E | E€ M2x2(Q) has no eigenvalues in Q. 


Let V be an inner product space finitely generated over $\mathbb{C}$ and let $\alpha$ be a selfadjoint
endomorphism of V. We know, by Proposition 17.4, that the eigenvalues of
$\alpha$ are all real and that eigenvectors of $\alpha$ associated with distinct eigenvalues are
orthogonal. Let us denote the eigenvalues of $\alpha$ by $c_1, \ldots, c_n$ where the indices are so
chosen that $c_1 \ge \cdots \ge c_n$. An important result known as the Courant-Fischer Minimax
Theorem states that, for each $1 \le k \le n$, we have
$c_k = \sup\{\inf\{\langle \alpha(w), w \rangle \mid w \in W \text{ and } \|w\| = 1\}\}$, where the supremum runs over all
subspaces W of V having dimension k.

Let us look at this from a different perspective. The function which assigns to
each $0_V \ne v \in V$ the scalar $R_\alpha(v) = \langle v, \alpha(v) \rangle \|v\|^{-2}$ is called the Rayleigh quotient
function. Note that the projection $\pi_v$ defined in connection with the Gram-Schmidt
theorem satisfies the condition that $\pi_v : \alpha(v) \mapsto R_\alpha(v)v$. By what we have already
seen, the image D of the Rayleigh quotient function is contained in $\mathbb{R}$. Moreover, if v is an eigenvector
of $\alpha$ with associated eigenvalue c, then $R_\alpha(v) = c$, and so $\emptyset \ne \mathrm{spec}(\alpha) \subseteq D$. On
the other hand, it is possible to show (though we will not do it here) that D is
contained in the closed interval $[c_n, c_1]$ bounded by the smallest and the largest
eigenvalues of $\alpha$, both endpoints of which in fact belong to D. This observation
can be used to define the Rayleigh quotient iterative scheme to find eigenvalues of a
selfadjoint endomorphism $\alpha$:

As an initial guess, choose a normal vector $v_0$ and let $d_0 = R_\alpha(v_0)$.
For k = 0, 1, 2, ... repeat the following steps:
(1) If $\alpha - d_k\sigma_1 \notin \mathrm{Aut}(V)$, then $d_k$ is an eigenvalue of $\alpha$, and we are done;
(2) Otherwise, $\alpha - d_k\sigma_1 \in \mathrm{Aut}(V)$. Set $y = (\alpha - d_k\sigma_1)^{-1}(v_k)$ and then compute
$v_{k+1} = \|y\|^{-1}y$ and $d_{k+1} = R_\alpha(v_{k+1})$.


This scheme will indeed produce an eigenvalue of $\alpha$ for all guesses of $v_0$ except
those in a set of measure 0, and when it converges, the convergence is very rapid. Its
main disadvantage is the time and effort needed in step (1) of the iteration to decide
if $\alpha - d_k\sigma_1$ is an automorphism of V or not (usually, if the matrix representing this
endomorphism is nonsingular or not) and, if it is, to compute its inverse; the algorithm
is therefore worthwhile only if this can be done without major computational
effort.


With kind permission of the Archives of the Mathematisches Forschungsinstitut
Oberwolfach (Fischer, Courant); with kind permission of the Science Photo Library (Strutt).

The twentieth-century German mathematicians Ernst Fischer and Richard
Courant studied spaces of functions. Courant, who headed the Mathematics Institute at the
University of Göttingen, fled Germany in 1933 and founded a similar institute in New York
City, which now bears his name. John William Strutt, Lord Rayleigh, was a nineteenth-century
British physicist and applied mathematician, who made important contributions to
mathematical physics and who won the Nobel prize in 1904 for his discovery of the inert
gas argon.


Example Let $\alpha$ be the endomorphism of $\mathbb{R}^3$ represented with respect to the canonical
basis by the symmetric matrix $A = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 3 & 1 \\ 1 & 1 & 4 \end{bmatrix}$. Then $\alpha$ is selfadjoint. Choose
$v_0 = \frac{1}{\sqrt{3}} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$. Using the above algorithm, we see that $d_0 = R_\alpha(v_0) = 5$, which is
not an eigenvalue of $\alpha$. Moreover,

$$v_1 = \begin{bmatrix} 0.3841106399\ldots \\ 0.5121475201\ldots \\ 0.7682212801\ldots \end{bmatrix} \quad \text{and} \quad d_1 = 5.213114754\ldots,$$

$$v_2 = \begin{bmatrix} 0.3971170680\ldots \\ 0.5206615990\ldots \\ 0.7557840528\ldots \end{bmatrix} \quad \text{and} \quad d_2 = 5.214319743\ldots.$$

The actual value of an eigenvalue of $\alpha$ is 5.214319744..., so we see that convergence
was very rapid indeed.
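The iteration in this example can be reproduced with a few lines of code. The following is a minimal sketch, not a production routine: instead of forming the inverse in step (2) explicitly, it solves the linear system $(A - d_k I)y = v_k$ by Gaussian elimination, and it treats a numerically vanishing pivot as the singular case of step (1). The function names and the pivot tolerance are assumptions introduced here.

```python
import math


def dot(u, v):
    """Dot product of two vectors given as lists."""
    return sum(x * y for x, y in zip(u, v))


def mat_vec(a, v):
    """Product of a matrix (nested lists) with a vector."""
    return [dot(row, v) for row in a]


def solve(a, b, eps=1e-14):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Returns None when a pivot is numerically zero, i.e. when the matrix is
    treated as singular (assumed tolerance eps).
    """
    n = len(a)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < eps:
            return None
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def rayleigh_iteration(a, v0, steps=5):
    """Return the list d_0, d_1, ... of Rayleigh quotients and the last vector."""
    n = len(a)
    norm = math.sqrt(dot(v0, v0))
    v = [x / norm for x in v0]
    d = dot(v, mat_vec(a, v))
    history = [d]
    for _ in range(steps):
        shifted = [[a[i][j] - (d if i == j else 0.0) for j in range(n)]
                   for i in range(n)]
        y = solve(shifted, v)
        if y is None:  # step (1): the shifted matrix is singular, d is an eigenvalue
            break
        norm = math.sqrt(dot(y, y))
        v = [x / norm for x in y]
        d = dot(v, mat_vec(a, v))
        history.append(d)
    return history, v


A = [[2.0, 1.0, 1.0], [1.0, 3.0, 1.0], [1.0, 1.0, 4.0]]
history, v_last = rayleigh_iteration(A, [1.0, 1.0, 1.0])
print(history[:3])
```

The first three entries of `history` correspond to $d_0$, $d_1$, and $d_2$ of the example. In exact arithmetic the first solve gives y proportional to (3, 4, 6), so that $d_1 = 318/61$, in agreement with the value quoted above.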


Proposition 17.6 Let V be a finitely-generated inner product space and let
$\alpha \in \mathrm{End}(V)$. If W is a subspace of V invariant under $\alpha$, then $W^\perp$ is invariant
under $\alpha^*$.


Proof Let $w \in W$ and $y \in W^\perp$. Then $\alpha(w) \in W$ and so $\langle w, \alpha^*(y) \rangle = \langle \alpha(w), y \rangle = 0$,
whence $\alpha^*(y) \in W^\perp$.


Let V be a nontrivial inner product space finitely generated over $\mathbb{R}$ and assume that
$\alpha \in \mathrm{End}(V)$ is orthogonally diagonalizable. Then there exists an orthonormal basis
$B = \{v_1, \ldots, v_n\}$ composed of eigenvectors of $\alpha$. Thus $\Phi_{BB}(\alpha)$ is a diagonal matrix
and so symmetric. In particular, $\Phi_{BB}(\alpha^*) = \Phi_{BB}(\alpha)^T = \Phi_{BB}(\alpha)$, which proves
that $\alpha = \alpha^*$ and so $\alpha$ is selfadjoint. The converse of this result follows from the
following proposition.


Proposition 17.7 Let V be a nontrivial finitely-generated inner product
space and let $\alpha \in \mathrm{End}(V)$ be selfadjoint. Then $\alpha$ is orthogonally diagonalizable.
The converse holds if the field of scalars is $\mathbb{R}$.


Proof We will prove the result by induction on n = dim(V). For n = 1, we know
by Proposition 17.5 that $\alpha$ has an eigenvector $v \in V$, and so $\{v_1\}$ is the desired
basis, where $v_1 = \|v\|^{-1}v$. Now assume that n > 1 and that the proposition has been
established for all spaces of dimension less than n. Pick $v_1$ as before and let W be
the subspace of V generated by $\{v_1\}$. Then $V = W \oplus W^\perp$ and, by Proposition 17.6,
we know that $W^\perp$ is invariant under $\alpha^* = \alpha$. Moreover, $W^\perp$ is an inner product
space of dimension n − 1 and the restriction of $\alpha$ to $W^\perp$ is selfadjoint. Therefore,
by the induction hypothesis, there exists an orthonormal basis $\{v_2, \ldots, v_n\}$ of $W^\perp$
composed of eigenvectors of $\alpha$. Since $v_1$ is orthogonal to each of the vectors in this
basis, we see that $\{v_1, \ldots, v_n\}$ is an orthonormal basis of V composed of eigenvectors of $\alpha$.


Now assume that the field of scalars is $\mathbb{R}$ and that $\alpha \in \mathrm{End}(V)$ is orthogonally
diagonalizable. Then there exists an orthonormal basis D of V composed of eigenvectors
of $\alpha$. This means that $\Phi_{DD}(\alpha)$ is a diagonal matrix, which is surely symmetric,
and so by Proposition 16.16 we see that $\alpha$ is selfadjoint.


Example The converse part of Proposition 17.7 is not true if the field of scalars 
is C. Indeed, consider the endomorphism a of C? represented with respect to the 


canonical basis by the matrix f f | The characteristic polynomial of œ is X?, 


so were it diagonalizable, it would have to be equal to oo. 
Let V be an inner product space. An endomorphism $\alpha \in \mathrm{End}(V)$ is positive definite
(resp., positive semidefinite) if and only if it is selfadjoint and satisfies the condition
that $\langle \alpha(v), v \rangle$ is a positive (resp., nonnegative) real number for all $0_V \ne v \in V$.
If there exist $0_V \ne v, w \in V$ satisfying $\langle \alpha(v), v \rangle > 0 > \langle \alpha(w), w \rangle$, then $\alpha$ is indefinite.

We see that $\sigma_c$ is positive definite for any positive real number c. We also note
that a positive-definite endomorphism must be monic, since $\alpha(v) = 0_V$ implies that
$\langle \alpha(v), v \rangle = 0$ and so $v = 0_V$. Therefore, every positive-definite endomorphism of a
finitely-generated inner product space is in fact an automorphism. Positive definite
endomorphisms have important applications in optimization and linear programming.¹


Example Let $D = \{z \in \mathbb{C} \mid |z| < 1\}$. If $z_1, \ldots, z_n$ are distinct complex numbers in D,
and if we are given complex numbers $w_1, \ldots, w_n$ in D, one can ask if there exists
an analytic function $f : D \to D$ satisfying $f(z_i) = w_i$ for all $1 \le i \le n$. The
Nevanlinna-Pick Interpolation Theorem states that such a function exists if and only
if the matrix $[a_{ij}]$ in which $a_{ij} = (1 - w_i\overline{w_j})(1 - z_i\overline{z_j})^{-1}$ for all $1 \le i, j \le n$,
represents a positive-definite endomorphism of $\mathbb{C}^n$. This theorem has been generalized
considerably in many directions.


Rolf Nevanlinna was a twentieth-century Finnish
mathematician who worked mostly in analysis.
Georg Pick was a twentieth-century Austrian
geometer, who was a good friend of Einstein.


Note that if $\alpha$ is positive definite and if $0_V \ne v \in V$ then $0 < \langle \alpha(v), v \rangle = \|\alpha(v)\| \cdot \|v\| \cos(t)$,
where t is the angle between $\alpha(v)$ and v, showing that $0 \le t < \frac{\pi}{2}$.


Example Let $V = \mathbb{R}^n$ on which we have the dot product defined, and let B be
the canonical basis. An endomorphism $\alpha$ of V is positive definite if and only if
$A = \Phi_{BB}(\alpha)$ is a symmetric matrix satisfying the condition that $v^T A v > 0$ for all
nonzero vectors $v \in V$. Such matrices have nice properties. For example, it can be
shown that if A is of this form then the Gauss-Seidel method applied to an equation
$AX = w$ will converge to the unique solution v, for any initial guess $v_0$ chosen. If
$\alpha$ is positive definite, then the norm on V defined by $\|v\|_A = \sqrt{v^T A v}$ is called an
elliptic norm. Any norm on V can be reasonably approximated by an elliptic norm,
a fact of importance in numerical analysis.
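The convergence claim for the Gauss-Seidel method can be illustrated on a small symmetric positive-definite system. The sketch below is illustrative only: the matrix is the one used in the Rayleigh quotient example earlier in this chapter (its leading principal minors 2, 5, and 17 are all positive), while the right-hand side, the starting guesses, and the fixed number of sweeps are assumptions made for the demonstration.

```python
def gauss_seidel(a, w, x0, sweeps=200):
    """Run a fixed number of Gauss-Seidel sweeps for the system a x = w.

    Each sweep updates the components of x in order, always using the most
    recently computed values of the other components.
    """
    n = len(a)
    x = list(x0)
    for _ in range(sweeps):
        for i in range(n):
            s = sum(a[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (w[i] - s) / a[i][i]
    return x


A = [[2.0, 1.0, 1.0], [1.0, 3.0, 1.0], [1.0, 1.0, 4.0]]
w = [7.0, 10.0, 15.0]  # chosen so that the exact solution is (1, 2, 3)
x_from_zero = gauss_seidel(A, w, [0.0, 0.0, 0.0])
x_from_far = gauss_seidel(A, w, [100.0, -50.0, 7.0])
print(x_from_zero, x_from_far)
```

Both starting guesses lead to the same limit, as the statement above predicts for symmetric positive-definite matrices.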


¹ Systems of linear equations defined by positive-definite endomorphisms of $\mathbb{R}^n$ first appear in
Gauss' work on least-squares approximation, which we will consider in a later chapter.
Example As an immediate consequence of the observation in the previous example,
we see that the endomorphism $\alpha$ of $\mathbb{R}^2$ defined by $\alpha : \begin{bmatrix} a \\ b \end{bmatrix} \mapsto \begin{bmatrix} a + b \\ b \end{bmatrix}$ satisfies the
condition that $\langle \alpha(v), v \rangle > 0$ for all $0_V \ne v \in V$, but is not selfadjoint and so is not
positive definite, since $\Phi_{BB}(\alpha) = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$ is not symmetric.

Example Even if og 4 œ € End(V) is selfadjoint, it may be the case that neither a 


nor —a is positive definite. For example, if œ € End(R7) is defined by a : a| 


[l 
[heelal [o] (i)i 


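Example The endomorphism $\alpha : \begin{bmatrix} a \\ b \end{bmatrix} \mapsto \begin{bmatrix} a + b \\ a + b \end{bmatrix}$ of $\mathbb{R}^2$ is selfadjoint and, for any
$v = \begin{bmatrix} a \\ b \end{bmatrix}$, we check that $\langle \alpha(v), v \rangle = (a + b)^2 \ge 0$, so $\alpha$ is positive semidefinite. On
the other hand, if $v = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$ then $\langle \alpha(v), v \rangle = 0$, so $\alpha$ is not positive definite. Since
$\alpha$ is represented with respect to the canonical basis by the matrix $\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$, this also
shows that in order for an endomorphism to be positive definite, it is not sufficient
that it be represented by a matrix all of the entries of which are positive.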


Let $V$ be an inner product space. If $\alpha, \beta \in \mathrm{End}(V)$ are selfadjoint, then $\alpha - \beta$ is also
selfadjoint. We write $\alpha > \beta$ whenever $\alpha - \beta$ is positive definite. Thus, $\alpha$ is positive
definite if and only if $\alpha > \sigma_0$. We write $\alpha \ge \beta$ if and only if $\alpha > \beta$ or $\alpha = \beta$. We
claim this is a partial-order relation on the set of all selfadjoint endomorphisms of $V$.
Indeed, it is clear that $\alpha \ge \alpha$ for all such endomorphisms $\alpha$. Suppose that $\alpha_1$, $\alpha_2$, and
$\alpha_3$ are selfadjoint endomorphisms of $V$ satisfying $\alpha_1 \ge \alpha_2 \ge \alpha_3$. If $\alpha_1 = \alpha_2$ or $\alpha_2 = \alpha_3$
then it is clear that $\alpha_1 \ge \alpha_3$. Let us therefore assume that $\alpha_1 > \alpha_2 > \alpha_3$. Then
for all nonzero $v \in V$ we see that
$\langle (\alpha_1 - \alpha_3)(v), v \rangle = \langle \alpha_1(v) - \alpha_2(v) + \alpha_2(v) - \alpha_3(v), v \rangle = \langle \alpha_1(v) - \alpha_2(v), v \rangle + \langle \alpha_2(v) - \alpha_3(v), v \rangle > 0$
and so $\alpha_1 > \alpha_3$. Finally, assume that
$\alpha_1 \ge \alpha_2$ and $\alpha_2 \ge \alpha_1$ but $\alpha_1 \neq \alpha_2$. Then $\alpha_1 > \alpha_2 > \alpha_1$ and so, as we have seen,
$\alpha_1 > \alpha_1$, which is a contradiction. Thus we have a partial order on the set of all
selfadjoint endomorphisms of $V$, called the Loewner partial order.


The Czech mathematician Karl Loewner emigrated to the United
States in 1933. His research concentrated in complex function theory
and spaces of functions.

Example Let $V$ be a finite-dimensional inner product space. Inequalities of the form
$\sum_i a_i \alpha_i > \sigma_0$, where the $\alpha_i$ are in $\mathrm{End}(V)$ and the $a_i$ are scalars, play an important
part in control theory, and have been studied extensively.


Proposition 17.8 Let $V$ be an inner product space and let $\alpha \in \mathrm{End}(V)$ be an
endomorphism for which $\alpha^*$ exists. Then $\alpha$ is positive definite if and only if
the function $\mu : (v, w) \mapsto \langle \alpha(v), w \rangle$ from $V \times V$ to the field of scalars is also
an inner product.


Proof First, let us assume that $\alpha$ is a positive-definite endomorphism of $V$. If
$v_1, v_2, w \in V$ then $\mu(v_1 + v_2, w) = \langle \alpha(v_1 + v_2), w \rangle = \langle \alpha(v_1), w \rangle + \langle \alpha(v_2), w \rangle = \mu(v_1, w) + \mu(v_2, w)$
and, similarly, we show that $\mu(cv, w) = c\mu(v, w)$ for all
scalars $c$. We also see that $\mu(v, w) = \langle \alpha(v), w \rangle = \langle v, \alpha^*(w) \rangle = \langle v, \alpha(w) \rangle = \overline{\langle \alpha(w), v \rangle} = \overline{\mu(w, v)}$.
If $0_V \neq v \in V$ then, by the assumption of positive definiteness,
we see that $\mu(v, v) = \langle \alpha(v), v \rangle$ is a positive real number, and it is clear that
$\mu(0_V, 0_V) = 0$. Thus $\mu$ is an inner product on $V$.

Conversely, assume that $\mu$ is an inner product on $V$. Then for all $v, w \in V$ we
have $\langle v, \alpha^*(w) \rangle = \langle \alpha(v), w \rangle = \mu(v, w) = \overline{\mu(w, v)} = \overline{\langle \alpha(w), v \rangle} = \langle v, \alpha(w) \rangle$ and so
$\alpha(w) = \alpha^*(w)$ for all $w \in V$, proving that $\alpha$ is selfadjoint. Moreover,
we have $\langle \alpha(v), v \rangle = \mu(v, v) > 0$ for all $0_V \neq v \in V$ and so $\alpha$ is positive definite.


Proposition 17.9 Let $V$ be an inner product space, with a given inner product
$(v, w) \mapsto \langle v, w \rangle$, and let $\mu$ be another inner product defined on $V$. Then there
exists a unique positive-definite endomorphism $\alpha$ of $V$ satisfying the condition
that $\mu(v, w) = \langle \alpha(v), w \rangle$ for all $v, w \in V$.


Proof Fix a vector $w \in V$. The function $v \mapsto \mu(v, w)$ belongs to $D(V)$ and so there
exists a unique vector $y_w \in V$ satisfying $\mu(v, w) = \langle v, y_w \rangle$ for all $v \in V$. Define
a function $\alpha : V \to V$ by $\alpha : w \mapsto y_w$. Then $\langle \alpha(v), w \rangle = \overline{\langle w, \alpha(v) \rangle} = \overline{\mu(w, v)} = \mu(v, w)$
for all $v, w \in V$. We claim that $\alpha \in \mathrm{End}(V)$. Indeed, if $w_1, w_2 \in V$ then for
all $y \in V$ we have

$$\langle \alpha(w_1 + w_2), y \rangle = \mu(w_1 + w_2, y) = \mu(w_1, y) + \mu(w_2, y) = \langle \alpha(w_1), y \rangle + \langle \alpha(w_2), y \rangle = \langle \alpha(w_1) + \alpha(w_2), y \rangle$$

and so $\alpha(w_1 + w_2) = \alpha(w_1) + \alpha(w_2)$. Similarly, we can show that $\alpha(cw) = c\alpha(w)$
for all $w \in V$ and all scalars $c$. Thus we see that $\alpha$ is indeed an endomorphism of $V$
satisfying the condition $\mu(v, w) = \langle \alpha(v), w \rangle$ for all $v, w \in V$, and so it is positive
definite.

Finally, $\alpha$ has to be unique since if $\mu(v, w) = \langle \beta(v), w \rangle$ for all $v, w \in V$, then
$\langle (\alpha - \beta)(v), w \rangle = \langle \alpha(v) - \beta(v), w \rangle = \langle \alpha(v), w \rangle - \langle \beta(v), w \rangle = 0$ for all $v, w \in V$,
which implies that $(\alpha - \beta)(v) = 0_V$ for all $v \in V$, showing that $\alpha = \beta$.

Proposition 17.10 Let $V$ be a finitely-generated inner product space and let
$\alpha \in \mathrm{End}(V)$. Then $\alpha$ is positive definite if and only if there exists an automorphism
$\beta$ of $V$ satisfying $\alpha = \beta^*\beta$.


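Proof Assume that there exists an automorphism $\beta$ of $V$ satisfying $\alpha = \beta^*\beta$.
Then, as previously noted, $\alpha$ is selfadjoint. Moreover, for all $0_V \neq v \in V$ we have
$\langle \alpha(v), v \rangle = \langle \beta^*\beta(v), v \rangle = \langle \beta(v), \beta^{**}(v) \rangle = \langle \beta(v), \beta(v) \rangle > 0$ since $\beta$ is an automorphism
and hence $\beta(v) \neq 0_V$. Therefore, $\alpha$ is positive definite.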

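Conversely, assume that $\alpha$ is a positive-definite endomorphism of $V$. Then the
function $\mu : (v, w) \mapsto \langle \alpha(v), w \rangle$ is an inner product on $V$. Let $\{v_1, \dots, v_n\}$ be a
basis for $V$ which is orthonormal with respect to the original inner product on $V$
and let $\{w_1, \dots, w_n\}$ be a basis for $V$ which is orthonormal with respect to $\mu$.
By Proposition 6.2, we know that there exists a unique endomorphism $\beta$ of $V$
satisfying $\beta(w_i) = v_i$ for all $1 \le i \le n$. Then $\beta$ is an epimorphism since its image
contains a basis for $V$ and so, since $V$ is finitely-generated, it is an automorphism
of $V$. Therefore, if $v = \sum_{i=1}^n a_i w_i$ and $w = \sum_{j=1}^n b_j w_j$ are vectors in $V$ we
see that

$$\langle \alpha(v), w \rangle = \mu(v, w) = \mu\Big(\sum_{i=1}^n a_i w_i, \sum_{j=1}^n b_j w_j\Big) = \sum_{i=1}^n \sum_{j=1}^n a_i \overline{b_j}\, \mu(w_i, w_j) = \sum_{i=1}^n a_i \overline{b_i}$$

and similarly

$$\langle \beta^*\beta(v), w \rangle = \langle \beta(v), \beta(w) \rangle = \Big\langle \beta\Big(\sum_{i=1}^n a_i w_i\Big), \beta\Big(\sum_{j=1}^n b_j w_j\Big) \Big\rangle = \sum_{i=1}^n \sum_{j=1}^n a_i \overline{b_j} \langle v_i, v_j \rangle = \sum_{i=1}^n a_i \overline{b_i},$$

and so we see that $\langle \beta^*\beta(v), w \rangle = \langle \alpha(v), w \rangle$ for all $v, w \in V$, which shows that
$\alpha = \beta^*\beta$.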


From Proposition 17.10 we know that if $A = [a_{ij}] \in M_{n \times n}(\mathbb{C})$ is a matrix representing
a positive-definite endomorphism of $\mathbb{C}^n$, namely if it is a Hermitian matrix
satisfying the condition that $v \cdot Av > 0$ for all nonzero vectors $v \in \mathbb{C}^n$, then there
exists a nonsingular matrix $B$ such that $A = B^H B$. Indeed, we can choose $B$ to
be upper triangular, so that it is a form of LU-decomposition, though it takes only
half as many arithmetic operations to perform. This decomposition is known as a
Cholesky decomposition of $A$. This decomposition need not be unique. Cholesky
decompositions are widely used in building economic and financial models. Because
of this wide usage, there are many algorithms available to efficiently calculate
Cholesky decompositions of general matrices or of matrices in special forms. Indeed,
one of the computational advantages of the Cholesky decomposition is that it
is numerically stable, even with no pivoting. On the other hand, if you change even
one element of $A$ you have to recompute the Cholesky decomposition of the new
matrix from scratch.


With kind permission of the Collections Ecole polytechnique (SABIX). 


Major André-Louis Cholesky was a cartographer in the French army, 
who used this method in connection with the mapping of the island of 
Crete before World War I. It had previously been used by other cartog- 
raphers, including Myrick H. Doolittle, of the computing division of 
the US Coast and Geodetic Survey, in 1878. A mathematical formula- 
tion had been given earlier by Toeplitz. 


The following algorithm calculates a Cholesky decomposition for real symmetric 
matrices. 


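For $k = 1, \dots, n$ perform the following steps:
(1) For each $1 \le i < k$ define $b_{ik} = b_{ii}^{-1}\big[a_{ik} - \sum_{j=1}^{i-1} b_{ji} b_{jk}\big]$;
(2) Set $b_{kk} = \sqrt{a_{kk} - \sum_{j=1}^{k-1} b_{jk}^2}$;
(3) For each $k < i \le n$ set $b_{ik} = 0$.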


Note that if the matrix A did not satisfy v - Av > 0 for all nonzero vectors v, the 
algorithm would hang up at some stage, trying to take the square root of a negative 
number. Indeed, attempting a Cholesky decomposition is often used as a test to see 
whether a given matrix represents a positive-definite endomorphism or not. 
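
The algorithm above, together with its use as a test for positive definiteness, can be sketched in Python as follows. This is an illustrative sketch only: the function name `cholesky_upper`, the use of a `ValueError` to signal failure, and the test matrix are assumptions made here, and indices are 0-based rather than 1-based.

```python
import math


def cholesky_upper(A):
    """Return an upper-triangular B with A = B^T B for a real symmetric A.

    Raises ValueError when a nonpositive number appears under the square
    root, which is how the positive-definiteness test shows up in practice.
    """
    n = len(A)
    B = [[0.0] * n for _ in range(n)]
    for k in range(n):
        # Step (1): entries of column k that lie above the diagonal.
        for i in range(k):
            s = sum(B[j][i] * B[j][k] for j in range(i))
            B[i][k] = (A[i][k] - s) / B[i][i]
        # Step (2): the diagonal entry.
        d = A[k][k] - sum(B[j][k] ** 2 for j in range(k))
        if d <= 0:
            raise ValueError("matrix is not positive definite")
        B[k][k] = math.sqrt(d)
        # Step (3): entries below the diagonal are already 0.0.
    return B


A = [[5.0, 2.0, 3.0],
     [2.0, 1.0, 1.0],
     [3.0, 1.0, 4.0]]
B = cholesky_upper(A)

# Rebuild B^T B to compare with A.
n = len(A)
rebuilt = [[sum(B[r][i] * B[r][j] for r in range(n)) for j in range(n)]
           for i in range(n)]
```

Calling `cholesky_upper([[1.0, 2.0], [2.0, 1.0]])` raises `ValueError`, since that symmetric matrix is indefinite.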


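Example Let $A = \begin{bmatrix} 5 & 2 & 3 \\ 2 & 1 & 1 \\ 3 & 1 & 4 \end{bmatrix} \in M_{3 \times 3}(\mathbb{R})$. This is a symmetric matrix satisfying
the condition that $v \cdot Av > 0$ for all nonzero vectors $v \in \mathbb{R}^3$ and having a Cholesky
decomposition $B^T B$, where $B = \frac{1}{5} \begin{bmatrix} 5\sqrt{5} & 2\sqrt{5} & 3\sqrt{5} \\ 0 & \sqrt{5} & -\sqrt{5} \\ 0 & 0 & 5\sqrt{2} \end{bmatrix}$.
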
Notice that Proposition 17.10 extends our ongoing analogy between the operation
$*$ and the conjugate operation on $\mathbb{C}$, just as the notion of "positive definite"
is the analog of positivity of complex numbers: a complex number $z$ is (real and)
positive if and only if there exists a nonzero complex number $y$ such that $z = \overline{y}y$.


Cholesky decompositions do not work for Hermitian matrices representing indefinite
endomorphisms of $\mathbb{C}^n$. In such cases, one has to make use of other methods,
such as the Bunch-Kaufman algorithm, which is quite effective for sparse matrices.


Proposition 17.11 Let $V$ be an inner product space. If $\alpha \in \mathrm{End}(V)$ is positive
definite, then every eigenvalue of $\alpha$ is a positive real number. The converse
holds if $\alpha$ is orthogonally diagonalizable.


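Proof Assume that $\alpha$ is positive definite. By Proposition 17.4, the eigenvalues of $\alpha$
are real numbers. If $c \in \mathrm{spec}(\alpha)$ is an eigenvalue of $\alpha$ associated with an eigenvector
$v$, then $0 < \langle \alpha(v), v \rangle = \langle cv, v \rangle = c\langle v, v \rangle$ and so $c > 0$, since we know
that $\langle v, v \rangle > 0$. Conversely, assume that every eigenvalue of $\alpha$ is positive and that
there exists an orthonormal basis $B$ of $V$ composed of eigenvectors of $\alpha$. Let
$v = \sum_{i=1}^n a_i v_i$ be a nonzero vector, where $\{v_1, \dots, v_n\} \subseteq B$. For each $1 \le i \le n$, let $c_i$ be an eigenvalue
of $\alpha$ associated with $v_i$. We can assume that the $v_i$ are arranged in such a
manner that $0 < c_1 \le c_2 \le \dots \le c_n$. Then

$$\langle \alpha(v), v \rangle = \Big\langle \sum_{i=1}^n a_i \alpha(v_i), \sum_{j=1}^n a_j v_j \Big\rangle = \sum_{i=1}^n \sum_{j=1}^n a_i \overline{a_j} \langle c_i v_i, v_j \rangle = \sum_{i=1}^n \sum_{j=1}^n c_i a_i \overline{a_j} \langle v_i, v_j \rangle = \sum_{i=1}^n c_i |a_i|^2 \ge c_1 \sum_{i=1}^n |a_i|^2 > 0,$$

and so $\alpha$ is positive definite.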


From Propositions 17.11 and 17.7, we see that if $V$ is a finitely-generated inner
product space over $\mathbb{R}$ or $\mathbb{C}$ and if $\alpha \in \mathrm{End}(V)$ is positive definite, then there
exists a basis of $V$ relative to which $\alpha$ is represented by a diagonal matrix in which
the entries of the diagonal are positive real numbers. Such a matrix is, of course,
nonsingular.


Proposition 17.12 Let $V$ and $W$ be inner-product spaces finitely-generated
over $\mathbb{R}$ and let $\alpha \in \mathrm{Hom}(V, W)$. Then $\|\alpha\| = \sqrt{c}$, where $c$ is the largest eigenvalue
of $\alpha^*\alpha \in \mathrm{End}(V)$, and where $\|\alpha\|$ is the norm induced by the respective
inner products on $V$ and $W$.


Proof If $c$ is an eigenvalue of $\beta = \alpha^*\alpha$ then there exists a nonzero vector $v$ such that
$\beta(v) = cv$ and so $c\|v\|^2 = \langle v, cv \rangle = \langle v, \alpha^*\alpha(v) \rangle = \langle \alpha(v), \alpha(v) \rangle \ge 0$, and so $c \ge 0$.
By Proposition 17.7, we know that there exists a basis $\{v_1, \dots, v_n\}$ of $V$ composed
of orthonormal eigenvectors of $\beta$. For each $1 \le i \le n$, let $c_i$ be an eigenvalue of
$\beta$ associated with $v_i$. After renumbering, we can assume that $0 \le c_1 \le \dots \le c_n$. If
$v = \sum_{i=1}^n a_i v_i \in V$ is nonzero, then

$$\|\alpha(v)\|^2 = \langle v, \beta(v) \rangle = \langle v, \alpha^*\alpha(v) \rangle = \sum_{j=1}^n \sum_{i=1}^n a_j c_i a_i \langle v_j, v_i \rangle = \sum_{i=1}^n c_i a_i^2 \le c_n \Big(\sum_{i=1}^n a_i^2\Big) = c_n \|v\|^2$$

and so $\|\alpha(v)\|^2 / \|v\|^2 \le c_n$. Therefore, by definition of the induced norm, $\|\alpha\| \le \sqrt{c_n}$.
But one easily sees that $\|\alpha(v_n)\|^2 / \|v_n\|^2 = c_n$ and so $\sqrt{c_n} \le \|\alpha\|$, proving
equality.


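Example Let $\alpha : \mathbb{R}^3 \to \mathbb{R}^2$ be the linear transformation defined by $\alpha : v \mapsto Av$,
where $A = \begin{bmatrix} 1 & -2 & 1 \\ 3 & 0 & -1 \end{bmatrix}$. Then $\alpha^*\alpha$ is the endomorphism of $\mathbb{R}^3$ given by
$v \mapsto \begin{bmatrix} 10 & -2 & -2 \\ -2 & 4 & -2 \\ -2 & -2 & 2 \end{bmatrix} v$. The eigenvalues of this endomorphism are
$0 < 8 - 2\sqrt{2} < 8 + 2\sqrt{2}$ and so $\|\alpha\| = \sqrt{8 + 2\sqrt{2}}$, which is approximately equal to 3.291.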
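
Proposition 17.12 suggests a simple numerical recipe: form $A^T A$ and estimate its largest eigenvalue. The sketch below does this with power iteration and is offered under stated assumptions only: the function name `operator_norm`, the starting vector of all ones (assumed not orthogonal to the dominant eigenvector), and the iteration count are choices made here. The matrix is taken so that $A^T A$ has the eigenvalues $0$, $8 - 2\sqrt{2}$, and $8 + 2\sqrt{2}$ quoted in the example.

```python
import math


def operator_norm(A, iterations=500):
    """Estimate ||A|| as the square root of the largest eigenvalue of A^T A."""
    m, n = len(A), len(A[0])
    # G = A^T A is an n x n symmetric matrix with nonnegative eigenvalues.
    G = [[sum(A[r][i] * A[r][j] for r in range(m)) for j in range(n)]
         for i in range(n)]
    v = [1.0] * n  # assumed to have a component along the top eigenvector
    lam = 0.0
    for _ in range(iterations):
        w = [sum(G[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm_w = math.sqrt(sum(x * x for x in w))
        if norm_w == 0.0:
            return 0.0
        v = [x / norm_w for x in w]
        # Rayleigh quotient of the normalized iterate.
        lam = sum(v[i] * sum(G[i][j] * v[j] for j in range(n))
                  for i in range(n))
    return math.sqrt(lam)


A = [[1.0, -2.0, 1.0],
     [3.0, 0.0, -1.0]]
estimate = operator_norm(A)
exact = math.sqrt(8 + 2 * math.sqrt(2))
```

The value of `estimate` should agree with `exact` to many decimal places, which rounds to the 3.291 quoted in the example.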


Let $V$ and $W$ be inner product spaces. A linear transformation $\alpha : V \to W$ preserves
inner products if and only if $\langle v_1, v_2 \rangle = \langle \alpha(v_1), \alpha(v_2) \rangle$ for all $v_1, v_2 \in V$.
Notice that any linear transformation which preserves inner products also preserves
distances: $\|v_1 - v_2\| = \|\alpha(v_1 - v_2)\| = \|\alpha(v_1) - \alpha(v_2)\|$ for all $v_1, v_2 \in V$.
Also, as a direct consequence of the definition, such a linear transformation preserves
the angles between vectors. Conversely, we have already noted that from the
norm defined by an inner product we can recover the inner product itself, so that
any linear transformation $\alpha : V \to W$ satisfying $\|v_1 - v_2\| = \|\alpha(v_1) - \alpha(v_2)\|$ for
all $v_1, v_2$ also preserves inner products. Such a linear transformation is called an
isometry.


Proposition 17.13 Let $V$ be an inner product space over $\mathbb{C}$ and let $\alpha \in \mathrm{End}(V)$
be an isometry. Then the eigenvalues of $\alpha$ lie on the unit circle
$\{z \in \mathbb{C} \mid |z| = 1\}$.


Proof If $c$ is an eigenvalue of $\alpha$ with associated eigenvector $v$, then
$\|v\|^2 = \|\alpha(v)\|^2 = \|cv\|^2 = |c|^2 \|v\|^2$ and so $|c| = 1$.


Example Let $V$ be an inner product space over $\mathbb{R}$ and let $0_V \neq y \in V$. This vector
$y$ defines an endomorphism $\alpha_y$ of $V$ by setting

$$\alpha_y : v \mapsto -v + 2\,\frac{\langle v, y \rangle}{\langle y, y \rangle}\, y.$$

This endomorphism is an isometry which satisfies $\alpha_y^2 = \sigma_1$, and $y$ is a fixed point
of $\alpha_y$.
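
The three claims in this example (isometry, $\alpha_y^2 = \sigma_1$, and $y$ being a fixed point) are easy to check numerically in $\mathbb{R}^3$ with the dot product. The helper names `dot` and `reflect` and the sample vectors below are illustrative assumptions, not part of the original text.

```python
def dot(u, v):
    """Dot product on R^n."""
    return sum(a * b for a, b in zip(u, v))


def reflect(y, v):
    """Apply alpha_y : v -> -v + 2 (<v, y> / <y, y>) y."""
    coeff = 2 * dot(v, y) / dot(y, y)
    return [-vi + coeff * yi for vi, yi in zip(v, y)]


y = [1.0, 2.0, 2.0]
v = [3.0, -1.0, 4.0]

image = reflect(y, v)        # alpha_y(v)
twice = reflect(y, image)    # alpha_y(alpha_y(v))
fixed = reflect(y, y)        # alpha_y(y)
```

Here `dot(image, image)` equals `dot(v, v)`, `twice` returns to `v`, and `fixed` equals `y`, matching the three properties stated above.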


Proposition 17.14 Let $V$ and $W$ be finitely-generated inner product spaces
having equal dimensions. Then the following conditions on a linear transformation
$\alpha : V \to W$ are equivalent:

(1) $\alpha$ is an isometry;

(2) $\alpha$ is an isomorphism which is an isometry;

(3) If $\{v_1, \dots, v_n\}$ is an orthonormal basis of $V$ then the set $\{\alpha(v_1), \dots, \alpha(v_n)\}$
is an orthonormal basis of $W$.


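Proof (1) $\Rightarrow$ (2): If $0_V \neq v \in V$ then $\langle v, v \rangle = \langle \alpha(v), \alpha(v) \rangle$, and so $\alpha(v) \neq 0_W$.
Thus we see that $\ker(\alpha) = \{0_V\}$ and so $\alpha$ is an isomorphism, since $V$ and $W$ have
the same finite dimension.

(2) $\Rightarrow$ (3): If $\{v_1, \dots, v_n\}$ is an orthonormal basis of $V$ then, since $\alpha$ is an isomorphism,
we see that $\{\alpha(v_1), \dots, \alpha(v_n)\}$ is a basis for $W$. Moreover, for all $1 \le i, j \le n$
we know that

$$\langle \alpha(v_i), \alpha(v_j) \rangle = \langle v_i, v_j \rangle = \begin{cases} 1 & \text{when } i = j, \\ 0 & \text{otherwise,} \end{cases}$$

and so this basis is orthonormal.

(3) $\Rightarrow$ (1): Let $\{v_1, \dots, v_n\}$ be an orthonormal basis of $V$. If $v = \sum_{i=1}^n a_i v_i$ and
$y = \sum_{j=1}^n b_j v_j$, then $\langle v, y \rangle = \sum_{i=1}^n a_i \overline{b_i}$. Moreover,

$$\langle \alpha(v), \alpha(y) \rangle = \sum_{i=1}^n \sum_{j=1}^n a_i \overline{b_j} \langle \alpha(v_i), \alpha(v_j) \rangle = \sum_{i=1}^n a_i \overline{b_i} = \langle v, y \rangle,$$

and this proves (1).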


In particular, if $V$ and $W$ are finitely-generated inner product spaces having equal
dimensions, then every isometry $\alpha : V \to W$ is an isomorphism. If $w_1, w_2 \in W$,
then $\langle w_1, w_2 \rangle = \langle \alpha\alpha^{-1}(w_1), \alpha\alpha^{-1}(w_2) \rangle = \langle \alpha^{-1}(w_1), \alpha^{-1}(w_2) \rangle$ and so we see that
$\alpha^{-1}$ is also an isometry. Moreover, there is always at least one isometry $\alpha$ from $V$
to $W$. Just pick orthonormal bases $\{v_1, \dots, v_n\}$ for $V$ and $\{w_1, \dots, w_n\}$ for $W$ and
define $\alpha$ by $\alpha : \sum_{i=1}^n a_i v_i \mapsto \sum_{i=1}^n a_i w_i$.


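Example The endomorphism of $\mathbb{R}^3$ represented with respect to the canonical basis
by $\frac{1}{\sqrt{6}} \begin{bmatrix} \sqrt{3} & \sqrt{2} & -1 \\ \sqrt{3} & -\sqrt{2} & 1 \\ 0 & \sqrt{2} & 2 \end{bmatrix}$ is an isometry.

Example Let $W$ be the set of all matrices $A \in M_{3 \times 3}(\mathbb{R})$ satisfying $A^T = -A$,
which is a subspace of $M_{3 \times 3}(\mathbb{R})$ of dimension 3. Define an inner product on $W$
as follows: if $A, B \in W$ then $\langle A, B \rangle = \frac{1}{2}\mathrm{tr}(AB^T)$. Let $V = \mathbb{R}^3$, which is an inner
product space with respect to the dot product. Define a linear transformation
$\alpha : V \to W$ by setting

$$\alpha : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} 0 & -c & b \\ c & 0 & -a \\ -b & a & 0 \end{bmatrix}.$$

If $A = \begin{bmatrix} 0 & -c & b \\ c & 0 & -a \\ -b & a & 0 \end{bmatrix}$ and $B = \begin{bmatrix} 0 & -f & e \\ f & 0 & -d \\ -e & d & 0 \end{bmatrix}$ then

$$AB^T = \begin{bmatrix} cf + be & -bd & -dc \\ -ea & cf + ad & -ec \\ -af & -fb & be + ad \end{bmatrix}$$

and so we can check that $\langle A, B \rangle = \begin{bmatrix} a \\ b \\ c \end{bmatrix} \cdot \begin{bmatrix} d \\ e \\ f \end{bmatrix}$ and thus $\alpha$ is an isometry, and hence is an
isomorphism.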
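
The identity behind the preceding example, namely that $\frac{1}{2}\mathrm{tr}(AB^T)$ on skew-symmetric matrices reproduces the dot product of the corresponding vectors, can be spot-checked in a few lines. The helper names `hat`, `inner_W`, and `dot`, and the sample vectors, are assumptions introduced for this sketch.

```python
def hat(v):
    """Map (a, b, c) to the skew-symmetric 3 x 3 matrix used in the example."""
    a, b, c = v
    return [[0, -c, b],
            [c, 0, -a],
            [-b, a, 0]]


def inner_W(A, B):
    """<A, B> = (1/2) tr(A B^T); tr(A B^T) is the sum of entrywise products."""
    return 0.5 * sum(A[i][k] * B[i][k] for i in range(3) for k in range(3))


def dot(u, v):
    """Dot product on R^3."""
    return sum(x * y for x, y in zip(u, v))


u = [1, -2, 3]
v = [4, 0, -5]
lhs = inner_W(hat(u), hat(v))   # inner product of the images in W
rhs = dot(u, v)                 # dot product of the original vectors
```

For these sample vectors both `lhs` and `rhs` equal -11, as the isometry property requires.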


Example Proposition 17.14 is no longer true if we remove the condition that the
spaces are finitely generated. Indeed, let $V = C(0, 1)$, on which we have the inner
product $\mu(f, g) = \int_0^1 f(x)g(x)x^2\,dx$, and let $W$ be the same space on which we
have the inner product $\langle f, g \rangle = \int_0^1 f(x)g(x)\,dx$. Let $\alpha : V \to W$ be the linear transformation
defined by $\alpha : f(x) \mapsto x f(x)$. Then $\mu(f, g) = \langle \alpha(f), \alpha(g) \rangle$ and so $\alpha$ is
an isometry. But $\alpha$ is not an isomorphism since the function $x \mapsto x^2 + 1$ does not
belong to the image of $\alpha$.


Let us now return to the case of inner product spaces the dimensions of which 
are not necessarily equal. 


Proposition 17.15 Let $V$ and $W$ be inner product spaces finitely-generated
over $\mathbb{R}$ and let $\alpha \in \mathrm{Hom}(V, W)$. Then $\alpha$ is an isometry if and only if
$\alpha^*\alpha = \sigma_1 \in \mathrm{End}(V)$.


Proof By Proposition 16.15, $\alpha^*$ exists. If $\alpha^*\alpha = \sigma_1 \in \mathrm{End}(V)$, and if $v_1, v_2 \in V$
then

$$\|v_1 - v_2\|^2 = \langle v_1 - v_2, v_1 - v_2 \rangle = \langle v_1 - v_2, \alpha^*\alpha(v_1 - v_2) \rangle = \langle \alpha(v_1 - v_2), \alpha(v_1 - v_2) \rangle = \|\alpha(v_1) - \alpha(v_2)\|^2,$$

and so $\|v_1 - v_2\| = \|\alpha(v_1) - \alpha(v_2)\|$, proving that $\alpha$ is an isometry. Conversely, if
$\alpha$ is an isometry and if $v_1, v_2 \in V$ then $\langle \alpha^*\alpha(v_1), v_2 \rangle = \langle \alpha(v_1), \alpha(v_2) \rangle = \langle v_1, v_2 \rangle$.
Therefore, by Proposition 16.14, we see that $\alpha^*\alpha(v_1) = v_1$ for all $v_1 \in V$, so
$\alpha^*\alpha = \sigma_1 \in \mathrm{End}(V)$.

Exercises


Exercise 1051 
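Let $V = \mathbb{C}[X]$ and define an inner product on $V$ by setting

$$\Big\langle \sum_{i=0}^{\infty} a_i X^i, \sum_{i=0}^{\infty} b_i X^i \Big\rangle = \sum_{i=0}^{\infty} a_i \overline{b_i}.$$

Let $\alpha$ be the endomorphism of $V$ defined by $\alpha : p(X) \mapsto (X + 1)p(X)$. Calculate
$\alpha^*$, or show that it does not exist.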


Exercise 1052 
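Let $V = \mathbb{C}[X]$ and define an inner product on $V$ by setting

$$\Big\langle \sum_{i=0}^{\infty} a_i X^i, \sum_{i=0}^{\infty} b_i X^i \Big\rangle = \sum_{i=0}^{\infty} a_i \overline{b_i}.$$

Let $\beta$ be the endomorphism of $V$ defined by $\beta : p(X) \mapsto p(X + 1)$. Calculate
$\beta^*$, or show that it does not exist.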


Exercise 1053 

Let $p > 1$ be an integer, let $G = \mathbb{Z}/(p)$, and let $V = \mathbb{C}^G$, which is an inner
product space over $\mathbb{C}$ with inner product defined by $\langle f, g \rangle = \sum_{n \in G} f(n)\overline{g(n)}$.
Let $\alpha$ be the endomorphism of $V$ defined by $\alpha(f) : n \mapsto f(n + 1) + f(n - 1)$.
Is $\alpha$ selfadjoint?


Exercise 1054 

Let V be a vector space over R. A nonempty subset K of V is convex if and 
only if cv + (1 —c)w € K whenever v, w € K and 0 < c < 1. Is the set of all 
selfadjoint endomorphisms of an inner product space Y over R necessarily a 
convex subset of the vector space End(Y)? 


Exercise 1055 
Let $V$ be an inner product space and let $\alpha$ be an endomorphism of $V$. Is the
endomorphism $\alpha^*\alpha - \sigma_1$ of $V$ selfadjoint?


Exercise 1056 

Let $n$ be a positive integer and let $V$ be the space of all polynomial functions
in $\mathbb{R}^{\mathbb{R}}$ of degree at most $n$. Define an inner product on $V$ by setting
$\langle f, g \rangle = \int_{-1}^{1} f(t)g(t)\,dt$. Let $\alpha \in \mathrm{End}(V)$ be defined by
$\alpha(f) : x \mapsto (1 - x^2) f''(x) - 2x f'(x)$. Show that $\alpha$ is selfadjoint.


Exercise 1057 
Let $V$ be an inner product space finitely generated over $\mathbb{C}$ and let $\alpha$ be an endomorphism
of $V$ satisfying $\alpha\alpha^* = \alpha^2$. Show that $\alpha$ is selfadjoint.

Exercise 1058
Let $V$ be an inner product space finitely generated over $\mathbb{C}$ and let $\alpha$ and $\beta$ be
selfadjoint endomorphisms of $V$ satisfying the condition that $\alpha\beta$ is a projection.
Is $\beta\alpha$ necessarily also a projection?


Exercise 1059 
Give an example of nonzero Hermitian matrices A and B satisfying AB = O = 


BA, or show that no such matrices exist. 


Exercise 1060 

Let $A \in M_{2 \times 2}(\mathbb{C})$ be Hermitian. Find real numbers $w$, $x$, $y$, and $z$ satisfying
$|A| = w^2 - x^2 - y^2 - z^2$.

Exercise 1061 

Let $O \neq A \in M_{3 \times 3}(\mathbb{C})$ be a Hermitian matrix. Show that $A^k \neq O$ for all positive
integers $k$.


Exercise 1062 


Find complex numbers $a$ and $b$ such that $\begin{bmatrix} a & 0 & b \\ 0 & 2a & a \\ i & 1 & a \end{bmatrix} \in M_{3 \times 3}(\mathbb{C})$ is a Hermitian
matrix.


Exercise 1063 
Determine all Hermitian matrices A € Ms5x5(C) satisfying A5 +24? +3A = 61. 


Exercise 1064 
A matrix $A \in M_{n \times n}(\mathbb{C})$ is anti-Hermitian if and only if $A^H = -A$. Show that $A$
is anti-Hermitian if and only if $iA$ is Hermitian.


Exercise 1065 
If matrices $A, B \in M_{n \times n}(\mathbb{C})$ are anti-Hermitian, show that the Lie product of $A$
and $B$ is also anti-Hermitian.


Exercise 1066 
Let $n$ be a positive integer and let $A \in M_{n \times n}(\mathbb{C})$. Show that every eigenvalue of
$A^H A$ is a nonnegative real number.


Exercise 1067
Let $V$ be an inner product space and let $\alpha \in \mathrm{End}(V)$ be selfadjoint. Show that
$\ker(\alpha) = \ker(\alpha^h)$ for all $h \ge 1$.

Exercise 1068

Let $V$ be a nontrivial finitely-generated inner product space and let $\alpha \in \mathrm{End}(V)$
be orthogonally diagonalizable and satisfy the condition that each of its eigenvalues
is real. Is $\alpha$ necessarily selfadjoint?


Exercise 1069 
Let œ € End(R?) be represented with respect to the canonical basis by the matrix 


1 2 3 
a 4 |. For which values of a is a positive definite? 

4 5 

Exercise 1070 

For each complex number z, let œ; be the endomorphism of C? represented with 


1 1 -1 
respect to the canonical basis by 1 1 z |. Does there exist a z for which 
=). 7z 1 


this endomorphism is positive definite? 


Exercise 1071
Let $V$ be an inner product space and let $\alpha \in \mathrm{End}(V)$ be positive definite. Is $\alpha^2$
necessarily positive definite?


Exercise 1072
Let $\alpha$ be a positive definite automorphism of an inner product space $V$. Is $\alpha^{-1}$
necessarily positive definite?


Exercise 1073 
Do there exist a, b, c, d € R such that the endomorphism of R4 represented with 


1 lao 
i ; 1 1 bj, ni ; 
respect to some basis by the matrix c 11138 positive definite? 
O d 1 1 


Exercise 1074 

Let V be an inner product space finitely generated over R and let a € End(V). 
Let D be a fixed basis for V and let A = ®pp(a@). Recall that we can write 
A = B + C, where B = (A +A’) is symmetric and C = (A — A’) is skew 
symmetric. Let 6, y € End(V) satisfy B = #pp(f) and C = ®pp(y). Show 
that a is positive definite if and only if y is positive definite. 


Exercise 1075

Let $\alpha$ be a positive semidefinite endomorphism of $\mathbb{R}^n$, represented with respect
to the canonical basis $\{v_1, \dots, v_n\}$ by a symmetric matrix $A = [a_{ij}] \in M_{n \times n}(\mathbb{R})$.
Show that $|a_{ij}| \le \frac{1}{2}(a_{ii} + a_{jj})$ for all $1 \le i, j \le n$.

Exercise 1076

Let $U$ be the set of all vectors $\begin{bmatrix} a_1 \\ \vdots \\ a_6 \end{bmatrix} \in \mathbb{R}^6$ satisfying the condition that
$\begin{bmatrix} a_1 & a_2 & a_3 \\ a_2 & a_4 & a_5 \\ a_3 & a_5 & a_6 \end{bmatrix}$ is positive semidefinite. Is $U$ a convex subset of $\mathbb{R}^6$?


Exercise 1077 


Do there exist real numbers a, b, c,d such that the matrix is 


Ca Fe 
Qe ee 
= = =. A 
pee SF © 


positive semidefinite? 


Exercise 1078 

A selfadjoint endomorphism « of R” is almost positive semidefinite if and only 
ay 

if a(v) - v > 0 for all nonzero vectors v= | : | satisfying }~_, a; = 0. Give 
an 

an example of an endomorphism of R? which is almost positive semidefinite but 

not positive semidefinite. 


Exercise 1079 


Let k and n be positive integers. A symmetric matrix in Mk+n,k+n(R) is 
T 


A C 
ing a positive-definite endomorphism of R* with respect to the canonical basis, 
and C is a matrix representing a positive-definite endomorphism of R” with re- 
spect to the canonical basis. Show that a quasidefinite matrix is nonsingular, and 
that its inverse is again quasidefinite. 


quasidefinite when it is of the form | | where B is a matrix represent- 


Exercise 1080 

Let V = R? together with the dot product. Find positive-definite endomorphisms 
a and £ of V satisfying the condition that their Jordan product is not positive 
definite. 


Exercise 1081
Let $V$ be an inner product space over $\mathbb{R}$ and let $\alpha \in \mathrm{End}(V)$. Show that $\alpha$ is
positive definite if and only if $\alpha + \alpha^*$ is positive definite.


Exercise 1082

Let $V$ be an inner product space finitely generated over $\mathbb{C}$ and let $\alpha$ and $\beta$ be
positive-definite endomorphisms of $V$ satisfying $\alpha\beta = \sigma_0$. Is it necessarily true
that $\alpha = \sigma_0$ or $\beta = \sigma_0$?

Exercise 1083


ces 


Let V = a,b,c eR $} and let W = R?, both of which together with the 


c 
dot product, are inner product spaces of dimension 3 over R. Find an isomor- 
phism a : V —> W which is also an isometry. 


Exercise 1084 
Let $V$ be an inner product space and let $\alpha$ be an endomorphism of $V$ which is an
isometry. Does $\alpha$ also preserve angles between vectors?


Exercise 1085 

Let $\alpha$ be a positive-definite endomorphism of a finite-dimensional inner product
space $V$ represented with respect to some fixed basis by an $n \times n$ matrix
$A = [a_{ij}]$. Show that $|A| \le \prod_{i=1}^n a_{ii}$.


Exercise 1086

Let $n$ be a positive integer and let $\alpha$ be a positive-definite endomorphism of
$\mathbb{C}^n$ represented with respect to the canonical basis by the matrix
$A = [a_{ij}] \in M_{n \times n}(\mathbb{C})$. Show that $a_{ii}$ is a positive real number for all $1 \le i \le n$.


Exercise 1087 


a 

Let a : R? — R? be the linear transformation defined by a: | b | => f E o 
c 

Calculate spec(aa*) and spec(a*q). 


Exercise 1088 

Let $n$ be a positive integer and let $V = \mathbb{R}^n$, on which we have defined the dot
product. Let $\alpha$ be a positive-definite endomorphism of $V$ represented with respect
to the canonical basis by the matrix $A = [a_{ij}] \in M_{n \times n}(\mathbb{R})$. Show that $|A| > 0$.
Is it necessarily true that $\mathrm{tr}(A) > 0$?


Exercise 1089 
Find endomorphisms $\alpha, \beta \in \mathrm{End}(\mathbb{C}^2)$ satisfying $\alpha > \beta$ (in the sense of Loewner)
but not $\alpha^2 > \beta^2$.


Exercise 1090 
Let $V$ be an inner product space and let $\alpha, \beta \in \mathrm{End}(V)$ be positive definite. Is it
necessarily true that $\alpha + \beta > \beta$?


Exercise 1091
Let $V$ and $W$ be inner product spaces finitely generated over $\mathbb{C}$. Let
$\alpha \in \mathrm{Hom}(V, W)$, $\theta \in \mathrm{Aut}(V)$, and $\varphi \in \mathrm{Aut}(W)$, where the automorphisms $\theta$ and
$\varphi$ are positive definite. Show that the automorphism $\theta - \alpha^*\varphi\alpha$ of $V$ is positive
definite if and only if the automorphism $\varphi - \alpha\theta\alpha^*$ of $W$ is positive definite.


Exercise 1092 

Let $\alpha \in \mathrm{End}(\mathbb{R}^n)$ be represented with respect to the canonical basis by a symmetric
matrix $A$. Show that $\langle \alpha(v), v \rangle \ge 0$ for all nonzero vectors $v \in \mathbb{R}^n$ if and only
if $\alpha + c\sigma_1$ is positive definite for every positive real number $c$.


Exercise 1093 

Let $V$ be the vector space of all infinitely-differentiable functions in $\mathbb{R}^{\mathbb{R}}$, on which
we define the inner product $\langle f, g \rangle = \int_0^{\pi} f(x)g(x)\,dx$. Let $W$ be the subspace of
all functions $f \in V$ satisfying $f(0) = f(\pi) = 0$. Show that the endomorphism
of $W$ defined by $f \mapsto f''$ is selfadjoint.


Exercise 1094 
Let V be a finitely-generated inner product space and let a € End(V) be selfad- 
joint. Show that ||a(v)|| < ||v|| for all ve V. 


Exercise 1095 

Let $V$ be an inner product space finitely generated over $\mathbb{R}$ of dimension greater
than 1, and let $\alpha$ be a selfadjoint endomorphism of $V$. Show that there are eigenvalues
$c \le d$ of $\alpha$ satisfying $c\|v\|^2 \le \langle \alpha(v), v \rangle \le d\|v\|^2$ for all $v \in V$.


Exercise 1096 

Let $V$ be an inner product space finitely generated over $\mathbb{R}$ and let $\alpha, \beta \in \mathrm{End}(V)$
be selfadjoint. Assume that the eigenvalues of $\alpha$ all lie in the interval $[a, b]$ on
the real line and that the eigenvalues of $\beta$ all lie in the interval $[c, d]$ on the real
line. Show that the eigenvalues of $\alpha + \beta$ all lie in the interval $[a + c, b + d]$ on
the real line.


Exercise 1097 
Let $V$ be an inner product space finitely generated over $\mathbb{C}$ and let $\alpha$ be a positive-definite
selfadjoint automorphism of $V$. Show that $\langle (\alpha + \alpha^{-1})(v), v \rangle \ge 2\langle v, v \rangle$
for all $v \in V$.


Exercise 1098 

Let $V$ be an inner product space finitely generated over $\mathbb{R}$ and let $\alpha$ be a positive-definite
selfadjoint automorphism of $V$. Show that
$\langle \alpha^{-1}(v), v \rangle = \max\{2\langle v, w \rangle - \langle \alpha(w), w \rangle \mid w \in V\}$ for all $v \in V$.


Exercise 1099

Let $V$ be a finite-dimensional inner product space over $\mathbb{C}$ and let $\alpha \neq \sigma_1$ be a positive-definite
endomorphism of $V$. Show that there exists no positive integer $p$ satisfying
$\alpha^p = \sigma_1$.

Exercise 1100

Let $V$ be a vector space finitely-generated over $\mathbb{R}$ and let $\alpha \in \mathrm{End}(V)$ be selfadjoint.
Show that at least one of the values $\pm\|\alpha\|$ is an eigenvalue of $\alpha$ and any
eigenvalue $c$ of $\alpha$ satisfies $-\|\alpha\| \le c \le \|\alpha\|$.


Exercise 1101 


Let A= E l € M2 x2(C) be Hermitian and let r > s be the (necessarily 


real) eigenvalues of A. Show that |b| < Fr — s). 


Exercise 1102 

Let $V$ be a nontrivial finitely-generated inner product space and let $\alpha$ and $\beta$ be
selfadjoint endomorphisms of $V$ satisfying $\alpha\beta = \beta\alpha$. Show that $\alpha$ and $\beta$ have a
common eigenvector.


Exercise 1103 

Let $n$ be a positive integer. An endomorphism $\alpha$ of $\mathbb{R}^n$ is copositive if and only if
it is selfadjoint and satisfies the condition that $\alpha(v) \cdot v$ is a positive real number
whenever $v$ is a nonzero vector all components of which are nonnegative. Clearly,
positive-definite endomorphisms are copositive. Give an example of a copositive
endomorphism which is not positive definite.


Exercise 1104 
Is the endomorphism of $\mathbb{C}^3$ represented with respect to the canonical basis by the
matrix $\begin{bmatrix} 4 & 2 - i & 1 + i \\ 2 + i & 3 & 0 \\ 1 - i & 0 & 2 \end{bmatrix} \in M_{3 \times 3}(\mathbb{C})$ positive definite?


Exercise 1105 


Let œ be the endomorphism of R? defined by setting « : H = E | 


Show that a is positive definite by constructing an endomorphism £ of R? satis- 
fying a = p*B. 


Exercise 1106 
Find selfadjoint automorphisms œ and f of R? satisfying the condition that 


a > p > —a but æ $ /p*B. 


Exercise 1107 

Let $\alpha \in \mathrm{End}(\mathbb{R}^n)$ be represented with respect to the canonical basis by a symmetric
matrix $A = [a_{ij}]$. Let $\beta \in \mathrm{End}(\mathbb{R}^n)$ be represented by the matrix $B = [e^{a_{ij}}]$. If
$\alpha$ is positive semidefinite, is $\beta$ necessarily positive semidefinite? Is $\beta$ necessarily
positive definite?


18 Unitary and Normal Endomorphisms


Let $V$ be an inner product space. An automorphism of $V$ which is an isometry is
called a unitary automorphism. It is easy to see that if $\alpha$ and $\beta$ are unitary automorphisms
of $V$ then $\alpha\beta$ and $\alpha^{-1}$ are also unitary automorphisms of $V$. It is also clear
that $\sigma_1$ is unitary. Therefore, the set of all unitary automorphisms of $V$ is a group of
automorphisms.


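Proposition 18.1 Let $V$ be an inner product space and let $\alpha \in \mathrm{Aut}(V)$ have
an adjoint. Then $\alpha$ is unitary if and only if $\alpha^* = \alpha^{-1}$.


Proof If $\alpha$ is unitary then $\langle \alpha(v), w \rangle = \langle \alpha(v), \alpha\alpha^{-1}(w) \rangle = \langle v, \alpha^{-1}(w) \rangle$ for all
$v, w \in V$ and so $\alpha^* = \alpha^{-1}$. Conversely, if $\alpha^* = \alpha^{-1}$ then
$\langle \alpha(v), \alpha(w) \rangle = \langle v, \alpha^*\alpha(w) \rangle = \langle v, w \rangle$ for all $v, w \in V$ and so $\alpha$ is unitary.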


As a direct consequence of Proposition 17.14, we see that if $V$ is an inner product
space finitely generated over its field of scalars then for $\alpha \in \mathrm{End}(V)$ the following
conditions are equivalent:

(1) $\alpha$ is an isometry;
(2) $\alpha$ is unitary;
(3) $\alpha$ maps an orthonormal basis of $V$ to an orthonormal basis of $V$.

If $V$ is an inner product space finitely generated over its field of scalars $F$, and
if $\alpha$ is a unitary automorphism of $V$ represented by a matrix $A = [a_{ij}] \in M_{n \times n}(F)$
with respect to a given orthonormal basis, then we see that $A^{-1} = A^H \in M_{n \times n}(F)$.
A matrix of this form over $F$ is called a unitary matrix. If $A$ is a unitary matrix
then so is $A^{-1}$ since $(A^{-1})^H = (A^H)^{-1}$. Also, if $A$ and $B$ are unitary matrices
then $(AB)^{-1} = B^{-1}A^{-1} = B^H A^H = (AB)^H$ so $AB$ is also unitary. The converse
is false. For example, the matrix $A = \begin{bmatrix} -1 & 1 \\ 0 & 1 \end{bmatrix} \in M_{2 \times 2}(\mathbb{R})$ is not unitary, but
$A^2 = I$ is.

Thus we see that the set of unitary matrices in $M_{n \times n}(F)$ defines a group of automorphisms
of $F^n$ and so an equivalence relation $\sim$ defined by the condition that
$A \sim B$ if and only if there exists a unitary matrix $P$ such that $A = P^{-1}BP$. Matrices
equivalent in this sense are unitarily similar. As an immediate consequence of the
definition, we see that $A$ is unitary if and only if the set of columns (resp., rows) of
$A$ is an orthonormal basis of $F^n$ (resp., $M_{1 \times n}(F)$) endowed with the dot product.


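Proposition 18.2 Let $n$ be a positive integer and let $A = [a_{ij}]$ and $B = [b_{ij}]$
be unitarily-similar matrices in $M_{n \times n}(\mathbb{C})$. Then

$$\sum_{i=1}^n \sum_{j=1}^n |a_{ij}|^2 = \sum_{i=1}^n \sum_{j=1}^n |b_{ij}|^2.$$


Proof We note that $\sum_{i=1}^n \sum_{j=1}^n |a_{ij}|^2 = \mathrm{tr}(A^H A)$. If $P$ is a unitary matrix satisfying
$B = P^{-1}AP$ then $\mathrm{tr}(B^H B) = \mathrm{tr}(P^{-1}A^H P P^{-1}AP) = \mathrm{tr}(P^{-1}A^H AP) = \mathrm{tr}(A^H A P P^{-1}) = \mathrm{tr}(A^H A)$,
and we are done.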
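
Proposition 18.2 can be illustrated with a small numerical experiment. The sketch below is only an illustration under stated assumptions: the helper names (`matmul`, `conj_transpose`, `frobenius_sq`), the particular unitary matrix $P$, and the test matrix $A$ are all choices made here rather than anything taken from the text.

```python
import math


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def conj_transpose(X):
    """Conjugate transpose X^H."""
    return [[complex(X[i][j]).conjugate() for i in range(len(X))]
            for j in range(len(X[0]))]


def frobenius_sq(X):
    """Sum of |x_ij|^2 over all entries, which equals tr(X^H X)."""
    return sum(abs(x) ** 2 for row in X for x in row)


s = 1 / math.sqrt(2)
P = [[s, s * 1j],
     [s * 1j, s]]                     # a unitary matrix: P^H P = I
A = [[1 + 2j, 3],
     [0.5j, -4]]

# Since P is unitary, P^{-1} = P^H, so B = P^{-1} A P.
B = matmul(matmul(conj_transpose(P), A), P)
```

The entries of `B` differ from those of `A`, yet `frobenius_sq(A)` and `frobenius_sq(B)` agree up to rounding (both are 30.25 here).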


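Example If $c, d \in \mathbb{C}$ satisfy the condition that $|c|^2 + |d|^2 = 1$, then the matrix
$\begin{bmatrix} c & d \\ -d & c \end{bmatrix} \in M_{2 \times 2}(\mathbb{C})$ is unitary. A matrix of this form is known as a Givens rotation
matrix. More generally, if $n \ge 3$ then a matrix $A = [a_{ij}] \in M_{n \times n}(\mathbb{C})$ is a
Givens rotation matrix if and only if there exist integers $1 \le h < k \le n$ and nonzero
complex numbers $c$ and $d$ satisfying $|c|^2 + |d|^2 = 1$ such that

$$a_{ij} = \begin{cases} c & \text{if } i = j \in \{h, k\}, \\ 1 & \text{if } i = j \notin \{h, k\}, \\ d & \text{if } i = h \text{ and } j = k, \\ -d & \text{if } i = k \text{ and } j = h, \\ 0 & \text{otherwise.} \end{cases}$$

These matrices play important roles in numerical algorithms.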


© Walter Gander. 


James Wallace Givens, a former assistant to von Neumann and considered
one of the fathers of twentieth-century American numerical
analysis, made major contributions to numerical matrix computation.


Example The matrix $A = \frac{1}{2}\begin{bmatrix} 1+i & 1-i \\ 1-i & 1+i \end{bmatrix} \in M_{2\times 2}(\mathbb{C})$ is unitary. This matrix has
important applications in the modeling of quantum computing, where it is often
denoted by $\sqrt{\mathrm{NOT}}$, since $A^2 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$ represents the negation operator in this
context.
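As a quick check of this example, the sketch below assumes that the matrix in question is the standard square-root-of-NOT gate $\frac{1}{2}\begin{bmatrix} 1+i & 1-i \\ 1-i & 1+i \end{bmatrix}$ and verifies both claims; the helper `matmul` is an illustrative name.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Assumed form of the matrix A (the usual square root of NOT).
A = [[(1 + 1j) / 2, (1 - 1j) / 2],
     [(1 - 1j) / 2, (1 + 1j) / 2]]
A_H = [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

square = matmul(A, A)   # should be the negation matrix [[0, 1], [1, 0]]
gram = matmul(A, A_H)   # should be the identity if A is unitary
print(square)
print(gram)
```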


/ 2 . 
Example It is easy to show that Ap = | i pg € M2x2(C) satisfies 


$A_b A_b^T = I$ for any real number $b$, but, except for the case of $b = 0$, it is not unitary.


Unitarily-similar matrices are surely similar, but the converse is not true. 


Example The matrices $\begin{bmatrix} 3 & 1 \\ -2 & 0 \end{bmatrix}$ and $\begin{bmatrix} 1 & 1 \\ 0 & 2 \end{bmatrix}$ in $M_{2\times 2}(\mathbb{R})$ are similar but not
unitarily similar, as we can see from Proposition 18.2.


Proposition 18.3 (Schur’s Theorem) If $n$ is a positive integer, then every
matrix in $M_{n\times n}(\mathbb{C})$ is unitarily similar to an upper-triangular matrix.


Proof We will proceed by induction on $n$. For $n = 1$, the result is trivial since every
$1 \times 1$ matrix is upper triangular. Assume now that $n > 1$ and that the result has
been established for $M_{(n-1)\times(n-1)}(\mathbb{C})$. Let $A = [a_{ij}] \in M_{n\times n}(\mathbb{C})$. Since we are
working over $\mathbb{C}$, we know that the characteristic polynomial of $A$ is completely
reducible, and so $A$ has an eigenvalue, call it $c_1$. Corresponding to that eigenvalue,
we have a normal eigenvector $v_1 = \begin{bmatrix} d_1 \\ \vdots \\ d_n \end{bmatrix}$ in which we can assume that $d_1 \in \mathbb{R}$.
We now are able to construct a basis $\{v_1, \ldots, v_n\}$ for $\mathbb{C}^n$ to which we can apply the
Gram-Schmidt procedure, and thus assume that it is in fact an orthonormal basis
(the vector $v_1$ does not change, since it was assumed to be normal to begin with).
The matrix $P_1$, the columns of which are these vectors, is therefore unitary. Now
set $A_1 = P_1^{-1}AP_1$. It is easy to see that the first column of $A_1$ is of the form
$\begin{bmatrix} c_1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$, so we can write $A_1$ in block form as $\begin{bmatrix} c_1 & B \\ O & A_2 \end{bmatrix}$, where $B \in M_{1\times(n-1)}(\mathbb{C})$
and $A_2 \in M_{(n-1)\times(n-1)}(\mathbb{C})$.
By the induction hypothesis, there is a unitary matrix $Q \in M_{(n-1)\times(n-1)}(\mathbb{C})$ such
that $Q^{-1}A_2Q$ is an upper-triangular matrix. Now set $P_2 = \begin{bmatrix} 1 & O \\ O & Q \end{bmatrix}$. Then $P_2$ is
a unitary matrix in $M_{n\times n}(\mathbb{C})$ and $P_2^{-1}P_1^{-1}AP_1P_2 = \begin{bmatrix} c_1 & BQ \\ O & Q^{-1}A_2Q \end{bmatrix}$ is an
upper-triangular matrix in $M_{n\times n}(\mathbb{C})$. Since $P_1P_2$ is again unitary, we are done.

If we are working over $\mathbb{R}$, then a matrix $A$ representing a unitary automorphism
of $\mathbb{R}^n$ satisfies $A^{-1} = A^T$. Such a matrix is called an orthogonal matrix. It is clear
that the matrix $I$ is orthogonal and that $A^{-1}$ is orthogonal whenever $A$ is orthogonal.
If $A$ and $B$ are orthogonal matrices then $(AB)^{-1} = B^{-1}A^{-1} = B^TA^T = (AB)^T$
and so $AB$ is also orthogonal. As an immediate consequence of the definition, we
see that $A$ is orthogonal if and only if the set of columns (resp., rows) of $A$ is an
orthonormal basis of $\mathbb{R}^n$ (resp., $M_{1\times n}(\mathbb{R})$) endowed with the dot product. It is also
clear that $A$ is orthogonal if and only if $A^T$ is orthogonal.


Example Permutation matrices, which we considered earlier, are clearly orthogonal. 

Example The matrices $\begin{bmatrix} \cos(t) & \sin(t) \\ -\sin(t) & \cos(t) \end{bmatrix}$ and $\begin{bmatrix} \cos(t) & \sin(t) \\ \sin(t) & -\cos(t) \end{bmatrix}$ are orthogonal
for every $t \in \mathbb{R}$, and one can show that these are the only orthogonal matrices in
$M_{2\times 2}(\mathbb{R})$. Indeed, suppose that the matrix $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \in M_{2\times 2}(\mathbb{R})$ is orthogonal.
Then $a_{11}^2 + a_{12}^2 = 1 = a_{21}^2 + a_{22}^2$, so $-1 \le a_{11} \le 1$. Hence there exists a real number
$t$ such that $a_{11} = \cos(t)$. Then $a_{12}^2 = 1 - a_{11}^2 = 1 - \cos^2(t) = \sin^2(t)$ and so $a_{12} = \pm\sin(t)$.
Also, $a_{11} = \cos(-t)$ and $\sin(-t) = -\sin(t)$. Thus, replacing $t$ by $-t$ if
necessary we can assume that $a_{11} = \cos(t)$ and $a_{12} = \sin(t)$. Similarly, there exists
an angle $s$ such that $a_{22} = \cos(s)$ and $a_{21} = \sin(s)$.

Since $0 = a_{11}a_{21} + a_{12}a_{22} = \cos(t)\sin(s) + \sin(t)\cos(s) = \sin(t + s)$, we see
that $t + s = 0$ or $t + s = \pi$. If $t + s = 0$, we obtain $A = \begin{bmatrix} \cos(t) & \sin(t) \\ -\sin(t) & \cos(t) \end{bmatrix}$. If
$t + s = \pi$, then $s = \pi - t$ and so $A = \begin{bmatrix} \cos(t) & \sin(t) \\ \sin(t) & -\cos(t) \end{bmatrix}$ since $\sin(t) = \sin(\pi - t)$
and $-\cos(t) = \cos(\pi - t)$.

Matrices of the first type are just Givens rotation matrices; matrices of the second
type are known as Jacobi reflection matrices.
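The classification of orthogonal matrices in $M_{2\times 2}(\mathbb{R})$ into the two families can be mirrored in a few lines of code. This is a sketch only: the function names `rotation`, `reflection` and `classify` are hypothetical, and the angle is recovered with `math.atan2` rather than by the case analysis used in the argument.

```python
import math

def rotation(t):
    """Matrix of the first type."""
    return [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]]

def reflection(t):
    """Matrix of the second type."""
    return [[math.cos(t), math.sin(t)], [math.sin(t), -math.cos(t)]]

def classify(A, tol=1e-12):
    """Return (kind, t) for an orthogonal 2 x 2 real matrix A."""
    (a11, a12), (a21, a22) = A
    t = math.atan2(a12, a11)  # angle with a11 = cos(t) and a12 = sin(t)
    if abs(a21 + a12) < tol and abs(a22 - a11) < tol:
        return "rotation", t
    if abs(a21 - a12) < tol and abs(a22 + a11) < tol:
        return "reflection", t
    raise ValueError("matrix is not orthogonal")

print(classify(rotation(0.4)))
print(classify(reflection(2.0)))
```

The second row is determined by the first up to the sign pattern, which is exactly what the two branches test.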
One can also show that every orthogonal matrix in $M_{3\times 3}(\mathbb{R})$ is similar to a
matrix of the form $\begin{bmatrix} \cos(t) & \sin(t) & 0 \\ -\sin(t) & \cos(t) & 0 \\ 0 & 0 & \pm 1 \end{bmatrix}$ for some $t \in \mathbb{R}$. More generally, if
$n > 2$ then every orthogonal matrix in $M_{n\times n}(\mathbb{R})$ is similar to a matrix in block
form $[D_{ij}]$, where $D_{ij} = O$ if $i \ne j$ and $D_{ii}$ is either $1$, $-1$, or a $2 \times 2$ matrix of the
form $\begin{bmatrix} \cos(t) & \sin(t) \\ -\sin(t) & \cos(t) \end{bmatrix}$.


Example Let $n$ be a positive integer. If $0 < c < 1$ is a real number, the matrix
$\begin{bmatrix} (\sqrt{c})I & -(\sqrt{1-c})I \\ (\sqrt{1-c})I & (\sqrt{c})I \end{bmatrix} \in M_{2n\times 2n}(\mathbb{R})$ is orthogonal, where $I$ denotes the
identity matrix in $M_{n\times n}(\mathbb{R})$.


Example Let $n$ be a positive integer and let $V = \mathbb{R}^n$, on which we have defined
the dot product. If $v \in V$ is a normal vector, then the matrix $A = I - 2(v \wedge v)$ is a
Householder matrix. These matrices are clearly symmetric. Moreover, if $A = I - 2(v \wedge v)$
then, since $v^Tv = 1$, we have $A^TA = A^2 = (I - 2[v \wedge v])^2 = I - 4vv^T + 4v(v^Tv)v^T = I$ and
so $A$ is orthogonal. Householder matrices have important uses in numerical analysis.
We should also mention that if $u \ne v$ are vectors in $V$ satisfying $\|u\| = \|v\|$, then
the vector $w = \|v - u\|^{-1}(v - u)$ defines a Householder matrix $A = I - 2(w \wedge w)$
satisfying $Au = v$. Since a Householder matrix is totally determined by one vector,
it is easy to store in a computer. One of the important uses of Householder matrices is
to compute QR-decompositions of matrices in a manner far more stable numerically
than via the use of the Gram-Schmidt method.
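The construction of a Householder matrix sending one vector to another of the same norm can be sketched as follows. The vectors `u` and `v` and the helper names are illustrative choices; the matrix is formed explicitly only for clarity, whereas practical code would store just the vector `w`.

```python
import math

def householder(w):
    """The matrix I - 2 w w^T for a unit vector w."""
    n = len(w)
    return [[(1.0 if i == j else 0.0) - 2.0 * w[i] * w[j] for j in range(n)]
            for i in range(n)]

def apply(A, x):
    """Matrix-vector product A x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

u = [3.0, 4.0, 0.0]
v = [5.0, 0.0, 0.0]            # same Euclidean norm as u
diff = [vi - ui for ui, vi in zip(u, v)]
length = math.sqrt(sum(d * d for d in diff))
w = [d / length for d in diff]  # w = (v - u) / ||v - u||

A = householder(w)
print(apply(A, u))  # approximately v
print(apply(A, v))  # approximately u, since A is its own inverse
```

Mapping a column of a matrix onto a multiple of the first canonical basis vector in this way is the basic step of a Householder QR-decomposition.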


Alston Householder, a twentieth-century American mathematician, 
was among the pioneer researchers of the numerical analysis of ma- 
trices using computers, who developed many of the basic algorithms 
used in this field. 


The complex analogs of Householder matrices are matrices of the form $I - 2ww^H$,
where $w \in \mathbb{C}^n$. Such matrices are Hermitian and unitary and, too, have an important
role in numerical computation.


Example A general method for the construction of orthogonal matrices, due to the
contemporary American mathematician George W. Soules, is given as follows: Let
$n > 1$ be an integer and let $w_1 \in \mathbb{R}^n$ be a normal vector all of the entries of which are
positive. Let $1 \le k < n$ and write $w_1 = \begin{bmatrix} u \\ v \end{bmatrix}$, where $u \in \mathbb{R}^k$ and $v \in \mathbb{R}^{n-k}$. Set
$a = \dfrac{\|v\|}{\|u\|}$ and $w_2 = \begin{bmatrix} au \\ -a^{-1}v \end{bmatrix}$. Then it is easy to see that $w_2$ is normal and orthogonal
to $w_1$. Moreover, by further partitioning the vectors $au$ and $-a^{-1}v$, we can
eventually construct mutually-orthogonal normal vectors $w_1, w_2, \ldots, w_n$. The matrix
with these vectors as columns is then orthogonal.
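One possible reading of the repeated partitioning is the recursive sketch below. It is an illustration under stated assumptions, not Soules's own formulation: the split point `k = m // 2` is an arbitrary choice (any $1 \le k < m$ would do), each sub-block is renormalized before being split again, and the names `complete` and `soules` are hypothetical.

```python
import math

def norm(x):
    return math.sqrt(sum(t * t for t in x))

def complete(x):
    """Return len(x) - 1 orthonormal vectors, all orthogonal to the positive vector x."""
    m = len(x)
    if m == 1:
        return []
    k = m // 2                       # split point (an arbitrary choice)
    u, v = x[:k], x[k:]
    a = norm(v) / norm(u)
    scale = norm(x)                  # makes the new vector a unit vector
    w2 = [a * t / scale for t in u] + [-t / (a * scale) for t in v]
    # Recurse inside each block and pad with zeros outside it.
    left = [y + [0.0] * (m - k) for y in complete(u)]
    right = [[0.0] * k + y for y in complete(v)]
    return [w2] + left + right

def soules(w1):
    """Orthonormal vectors w1, ..., wn built from a positive unit vector w1."""
    return [list(w1)] + complete(list(w1))

s = math.sqrt(30.0)
ws = soules([1 / s, 2 / s, 3 / s, 4 / s])
gram = [[sum(p * q for p, q in zip(x, y)) for y in ws] for x in ws]
for row in gram:
    print([round(g, 12) for g in row])   # rows of the identity matrix, up to the sign of zero
```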


Notice that if $F$ is either $\mathbb{R}$ or $\mathbb{C}$, and if $A \in M_{n\times n}(F)$ is a unitary matrix
the columns of which are $v_1, \ldots, v_n$, then the identity $AA^H = I$ implies that
$\{v_1, \ldots, v_n\}$ is an orthonormal set of vectors in $F^n$, on which we have the dot product,
and hence it is a basis for this space. Conversely, if $\{v_1, \ldots, v_n\}$ is an orthonormal
basis of $F^n$ then the matrix the columns of which are these vectors is unitary.
Similarly, a matrix in $M_{n\times n}(\mathbb{R})$ is orthogonal if and only if the set of its columns
forms an orthonormal basis for $\mathbb{R}^n$ with the dot product. Another way of putting this
is that a matrix in $M_{n\times n}(\mathbb{R})$ the columns of which are $v_1, \ldots, v_n$ is orthogonal if
and only if $\sum_{i=1}^{n} v_i \wedge v_i = I$.

Proposition 18.4 Let $V$ be an inner product space of finite dimension $n$
over $\mathbb{R}$. Let $\alpha$ be a unitary automorphism of $V$, which is represented by a
matrix $A \in M_{n\times n}(\mathbb{R})$ with respect to a given orthonormal basis of $V$. Then
$|A| = \pm 1$.


Proof We know that if $\alpha$ is represented by $A = [a_{ij}]$ with respect to the given basis,
then $\alpha^*$ is represented by $A^T$. From Proposition 18.1, we deduce that $AA^T = I$ and
so $|A|^2 = |A| \cdot |A^T| = |I| = 1$, which in turn implies that $|A| = \pm 1$.


Example The converse of Proposition 18.4 is false, even for matrices the columns
of which are orthogonal. Thus, the matrix $\begin{bmatrix} 0.25 & 0 \\ 0 & 4 \end{bmatrix}$ has determinant 1, but does
not represent a unitary automorphism of $\mathbb{R}^2$.
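The counterexample is easy to confirm by direct computation. In the small sketch below, the helper names are illustrative; the check is simply that the determinant is 1 while $AA^T$ differs from the identity.

```python
def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

A = [[0.25, 0.0], [0.0, 4.0]]
det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]
gram = matmul(A, transpose(A))          # equals the identity only for orthogonal A
is_orthogonal = gram == [[1.0, 0.0], [0.0, 1.0]]

print(det_A, gram, is_orthogonal)
```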


The orthogonal matrices in $M_{n\times n}(\mathbb{R})$ having determinant equal to 1 are known
as the special orthogonal matrices, and the set of all such matrices is denoted by
$SO(n)$. This subset of $M_{n\times n}(\mathbb{R})$ is clearly closed under taking products as well as
taking inverses, since if $A \in SO(n)$ then $|A^{-1}| = |A^T| = |A| = 1$. If $A \in M_{n\times n}(\mathbb{R})$
is a special orthogonal matrix, where $n$ is an odd integer, then $1 \in \operatorname{spec}(A)$. To see
this, we note that $|A - I| = |A - AA^T| = |A| \cdot |I - A^T| = |I - A^T| = |I - A|$
and, since $n$ is odd, $|I - A| = (-1)^n|A - I| = -|A - I|$. Thus we must have $|A - I| = 0$,
and so $1 \in \operatorname{spec}(A)$.


Example We have already noted that the only orthogonal matrices in $M_{2\times 2}(\mathbb{R})$ are
of the form $\begin{bmatrix} \cos(t) & \sin(t) \\ -\sin(t) & \cos(t) \end{bmatrix}$ or $\begin{bmatrix} \cos(t) & \sin(t) \\ \sin(t) & -\cos(t) \end{bmatrix}$ for some $t \in \mathbb{R}$. Matrices
of the first type are special, whereas matrices of the second type are not.


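Example The matrix $\begin{bmatrix} 0 & -1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & -1 \end{bmatrix} \in M_{5\times 5}(\mathbb{R})$ is special orthogonal.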


Let $V$ be an inner product space. An endomorphism $\alpha \in \operatorname{End}(V)$ is normal if
and only if $\alpha^*$ exists and satisfies $\alpha^*\alpha = \alpha\alpha^*$. From this definition, it is clear that $\alpha$
is normal if and only if $\alpha^*$ is normal. Clearly, selfadjoint endomorphisms of $V$ are
normal, as are unitary automorphisms.


Example If $a, b \in \mathbb{R}$ satisfy $b \ne 0$ and $a^2 + b^2 \ne 1$, then the automorphism $\alpha$ of $\mathbb{R}^2$
defined by $\alpha: v \mapsto \begin{bmatrix} a & -b \\ b & a \end{bmatrix} v$ is normal but neither unitary nor selfadjoint.

Example If $0 \ne a, b \in \mathbb{R}$ then the automorphism $\alpha$ of $\mathbb{C}^2$ defined by


Bier 


is normal but neither unitary nor selfadjoint.


Proposition 18.5 Let $V$ be an inner product space. An endomorphism
$\alpha \in \operatorname{End}(V)$ for which $\alpha^*$ exists is normal if and only if $\|\alpha(v)\| = \|\alpha^*(v)\|$
for all $v \in V$.


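Proof If $\alpha$ is normal and $v \in V$, then
$\|\alpha(v)\|^2 = \langle\alpha(v), \alpha(v)\rangle = \langle v, \alpha^*\alpha(v)\rangle = \langle v, \alpha\alpha^*(v)\rangle = \langle v, \alpha^{**}\alpha^*(v)\rangle = \langle\alpha^*(v), \alpha^*(v)\rangle = \|\alpha^*(v)\|^2$
and so $\|\alpha(v)\| = \|\alpha^*(v)\|$. Conversely, assume that this condition holds. Then for
each $v \in V$ we have

\[ \langle(\alpha\alpha^* - \alpha^*\alpha)(v), v\rangle = \langle\alpha\alpha^*(v), v\rangle - \langle\alpha^*\alpha(v), v\rangle
= \langle\alpha^*(v), \alpha^*(v)\rangle - \langle\alpha(v), \alpha(v)\rangle = 0. \]

But $\alpha\alpha^* - \alpha^*\alpha$ is selfadjoint and so, by Proposition 17.3, we see that
$\alpha\alpha^* - \alpha^*\alpha = \sigma_0$, and so $\alpha\alpha^* = \alpha^*\alpha$.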


As a consequence of Proposition 18.5 we see that if $\alpha$ is a normal endomorphism
of an inner product space $V$ and if $v \in V$ then
$v \in \ker(\alpha) \Leftrightarrow \|\alpha(v)\| = 0 \Leftrightarrow \|\alpha^*(v)\| = 0 \Leftrightarrow v \in \ker(\alpha^*)$ and so $\ker(\alpha) = \ker(\alpha^*)$.

We now take a short look at the extensive theory of eigenvalues of normal 
endomorphisms of inner product spaces. We will restrict our attention to finite- 
dimensional spaces, since the theory for infinite-dimensional spaces requires ad- 
ditional topological assumptions. 




The study of eigenvalues of normal and selfadjoint endomorphisms of
inner product spaces was developed simultaneously by the American
mathematician Marshall Stone and by John von Neumann, inspired
by problems in quantum theory.


Proposition 18.6 Let $V$ be an inner product space and let $\alpha \in \operatorname{End}(V)$ be
normal. Then every eigenvector of $\alpha$ is also an eigenvector of $\alpha^*$ and if $c$ is
an eigenvalue of $\alpha$ then $\bar{c}$ is an eigenvalue of $\alpha^*$.


Proof If $v \in V$ then, as we have noted, $\|\alpha(v)\| = \|\alpha^*(v)\|$. For a scalar $c$, we see
that

\[ (\alpha - c\sigma_1)^*(\alpha - c\sigma_1) = (\alpha^* - \bar{c}\sigma_1)(\alpha - c\sigma_1) = (\alpha - c\sigma_1)(\alpha^* - \bar{c}\sigma_1)
= (\alpha - c\sigma_1)(\alpha - c\sigma_1)^*, \]

and so $\alpha - c\sigma_1$ is also normal. Thus, $\|(\alpha - c\sigma_1)(v)\| = \|(\alpha^* - \bar{c}\sigma_1)(v)\|$ for $v \in V$
and so, in particular, we see that $v \in \ker(\alpha - c\sigma_1)$ if and only if $v \in \ker(\alpha^* - \bar{c}\sigma_1)$.
In other words, $v$ is an eigenvector of $\alpha$ associated with the eigenvalue $c$ if and only
if it is an eigenvector of $\alpha^*$ associated with the eigenvalue $\bar{c}$.


Since $\alpha^{**} = \alpha$ for any endomorphism $\alpha$ of $V$, we see from Proposition 18.6 that
if $\alpha$ is normal then a scalar $c$ is an eigenvalue of $\alpha$ if and only if $\bar{c}$ is an eigenvalue
of $\alpha^*$.

Another interesting consequence of Proposition 18.6 is the following: Let $V$
be a finitely-generated inner product space and let $\alpha \in \operatorname{Aut}(V)$ be unitary. Then
$\alpha$ is surely normal. If $c \in \operatorname{spec}(\alpha)$ then $c \ne 0$ since $\alpha$ is an automorphism. If $v$
is an eigenvector associated with $c$ then $v = (\alpha^*\alpha)(v) = \alpha^*(cv) = c\alpha^*(v)$ and so
$\alpha^*(v) = c^{-1}v$. This shows that $c^{-1}$ is an eigenvalue of $\alpha^*$ and hence, by
Proposition 18.6, $\bar{c}^{-1} \in \operatorname{spec}(\alpha)$.


Example In Proposition 17.5, we saw that if $\alpha$ is a selfadjoint endomorphism of an
inner product space $V$ finitely generated over $\mathbb{R}$, then $\operatorname{spec}(\alpha) \ne \emptyset$. This is not
necessarily true for normal endomorphisms of inner product spaces which are not
selfadjoint. For example, let $V = \mathbb{R}^2$ together with the dot product, and let $\alpha \in \operatorname{End}(V)$
be defined by $\alpha: \begin{bmatrix} a \\ b \end{bmatrix} \mapsto \begin{bmatrix} -b \\ a \end{bmatrix}$; then we have already seen that $\operatorname{spec}(\alpha) = \emptyset$. One
can easily check that $\alpha$ is normal but not selfadjoint.


Proposition 18.7 Let $V$ be an inner product space finitely generated over $\mathbb{C}$
and let $\alpha \in \operatorname{End}(V)$. Then $\alpha$ is normal if and only if it is orthogonally
diagonalizable.


Proof Assume that $\alpha$ is normal. We will proceed by induction on $n = \dim(V)$.
First, assume that $n = 1$. Since we are working over $\mathbb{C}$, we know that $\operatorname{spec}(\alpha) \ne \emptyset$
and so there exists a normal eigenvector $v_1$ of $\alpha$. Then $V = \mathbb{C}v_1$ and we are
done. Now assume that $n > 1$ and that the result has been proven for subspaces of
dimension $n - 1$. Again, there exists a normal eigenvector $v_1$ of $\alpha$. Set $W = \mathbb{C}v_1$.
The subspace $W$ of $V$ is invariant under $\alpha$, and so, by Proposition 18.6, it is also
invariant under $\alpha^*$. Therefore, $W^\perp$ is invariant under $\alpha^{**} = \alpha$. The restriction of
$\alpha$ to $W^\perp$ is a normal endomorphism, the adjoint of which is the restriction of $\alpha^*$
to $W^\perp$. By induction, we know that there exists an orthonormal basis $\{v_2, \ldots, v_n\}$
of $W^\perp$ composed of eigenvectors of $\alpha$, and so $\{v_1, \ldots, v_n\}$ is the basis of $V$ we are seeking.

Conversely, assume that there exists an orthonormal basis $B = \{v_1, \ldots, v_n\}$ of $V$
composed of eigenvectors of $\alpha$. Then $\Phi_{BB}(\alpha) = [a_{ij}]$ is a diagonal matrix satisfying
the condition that each $a_{ii}$ is an eigenvalue of $\alpha$. Moreover, $\Phi_{BB}(\alpha^*) = \Phi_{BB}(\alpha)^H$
and this too is a diagonal matrix. Since diagonal matrices commute, we see that
$\alpha\alpha^* = \alpha^*\alpha$, and so $\alpha$ is normal.


Note that Proposition 18.7 does not imply that if $V$ is an inner product space
finitely generated over $\mathbb{C}$ and if $\alpha \in \operatorname{End}(V)$ is normal, then every basis $B$ of $V$
composed of eigenvectors of $\alpha$ is necessarily orthonormal or that its elements are
even necessarily mutually orthogonal, merely that one such basis exists.


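Example Let $\alpha$ be the endomorphism of $\mathbb{C}^4$ represented with respect to the canonical
basis by the matrix $A = \begin{bmatrix} 1 & 2 & 0 & 0 \\ -2 & 1 & 0 & 0 \\ 0 & 0 & 3 & -1 \\ 0 & 0 & 1 & 3 \end{bmatrix}$. One easily checks that $AA^H = A^HA$,
and so $\alpha$ is a normal automorphism of $\mathbb{C}^4$. The characteristic polynomial of
$A$ is

\[ X^4 - 8X^3 + 27X^2 - 50X + 50 = (X^2 - 6X + 10)(X^2 - 2X + 5), \]

and so $\operatorname{spec}(\alpha) = \{3 \pm i, 1 \pm 2i\}$. The set

\[ \left\{ \frac{1}{\sqrt{2}}\begin{bmatrix} -i \\ 1 \\ 0 \\ 0 \end{bmatrix},\ \frac{1}{\sqrt{2}}\begin{bmatrix} i \\ 1 \\ 0 \\ 0 \end{bmatrix},\ \frac{1}{\sqrt{2}}\begin{bmatrix} 0 \\ 0 \\ 1 \\ -i \end{bmatrix},\ \frac{1}{\sqrt{2}}\begin{bmatrix} 0 \\ 0 \\ 1 \\ i \end{bmatrix} \right\} \]

is an orthonormal basis for $\mathbb{C}^4$ composed of eigenvectors of $\alpha$.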
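The claims in this example can be checked mechanically. The sketch below assumes that the matrix is $A = \begin{bmatrix} 1 & 2 & 0 & 0 \\ -2 & 1 & 0 & 0 \\ 0 & 0 & 3 & -1 \\ 0 & 0 & 1 & 3 \end{bmatrix}$ and that the eigenvectors are the unnormalized vectors listed in `eigenpairs`; both are a reading of the example rather than a certainty, and the helper names are illustrative.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def conj_transpose(X):
    return [[x.conjugate() for x in col] for col in zip(*X)]

def matvec(X, v):
    return [sum(x * vj for x, vj in zip(row, v)) for row in X]

# Assumed entries of A (block diagonal with two 2 x 2 blocks).
A = [[1, 2, 0, 0],
     [-2, 1, 0, 0],
     [0, 0, 3, -1],
     [0, 0, 1, 3]]
A_H = conj_transpose(A)
is_normal = matmul(A, A_H) == matmul(A_H, A)

# Candidate eigenpairs; dividing each vector by sqrt(2) makes it a unit vector.
eigenpairs = [(1 + 2j, [-1j, 1, 0, 0]),
              (1 - 2j, [1j, 1, 0, 0]),
              (3 + 1j, [0, 0, 1, -1j]),
              (3 - 1j, [0, 0, 1, 1j])]

def residual(lam, v):
    """Largest entry of A v - lam v in absolute value."""
    return max(abs(x - lam * vj) for x, vj in zip(matvec(A, v), v))

def inner(v, w):
    """Dot product on C^4, conjugate-linear in the second argument."""
    return sum(x * complex(y).conjugate() for x, y in zip(v, w))

print(is_normal, [residual(lam, v) for lam, v in eigenpairs])
```

Eigenvectors belonging to different eigenvalues of a normal matrix are orthogonal, which `inner` can be used to confirm for the four vectors above.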


Proposition 18.8 Let $V$ be a finitely-generated inner product space. Then the
following conditions on a projection $\alpha \in \operatorname{End}(V)$ are equivalent:

(1) $\alpha$ is normal;

(2) $\alpha$ is selfadjoint;

(3) $\ker(\alpha) = \operatorname{im}(\alpha)^\perp$.


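Proof (1) $\Rightarrow$ (2): From (1) we know that $\|\alpha(v)\| = \|\alpha^*(v)\|$ for all $v \in V$. In
particular, $\alpha(v) = 0_V$ if and only if $\alpha^*(v) = 0_V$ so that $\ker(\alpha) = \ker(\alpha^*)$. If $v \in V$ and
$w = v - \alpha(v)$ then $\alpha(w) = \alpha(v) - \alpha^2(v) = \alpha(v) - \alpha(v) = 0_V$ and so $\alpha^*(w) = 0_V$.
Therefore, $\alpha^*(v) = \alpha^*\alpha(v)$ for all $v \in V$, whence $\alpha^* = \alpha^*\alpha$. This implies that
$\alpha = \alpha^{**} = (\alpha^*\alpha)^* = \alpha^*\alpha^{**} = \alpha^*\alpha = \alpha^*$, which proves (2).

(2) $\Rightarrow$ (3): If $v, w \in V$ then, from (2), we see that $\langle\alpha(v), w\rangle = \langle v, \alpha(w)\rangle$. In
particular, if $v \in \ker(\alpha)$ then $\langle v, \alpha(w)\rangle = 0$ for all $w \in V$, which is to say that
$v \in \operatorname{im}(\alpha)^\perp$. Conversely, if $v \in \operatorname{im}(\alpha)^\perp$ then $\langle v, \alpha(w)\rangle = 0$ for all $w \in V$, which
implies that $\alpha(v)$ is orthogonal to every element of $V$. Therefore, $\alpha(v) = 0_V$ and so
$v \in \ker(\alpha)$. This proves (3).

(3) $\Rightarrow$ (1): Let $v, w \in V$. Since $\alpha$ is a projection, we have $v - \alpha(v) \in \ker(\alpha)$. It
is also clear that $\alpha(w) \in \operatorname{im}(\alpha)$. Therefore,

\[ 0 = \langle v - \alpha(v), \alpha(w)\rangle = \langle v, \alpha(w)\rangle - \langle\alpha(v), \alpha(w)\rangle = \langle v, \alpha(w)\rangle - \langle v, \alpha^*\alpha(w)\rangle, \]

and since this is true for all $v, w \in V$, we have $\alpha = \alpha^*\alpha$. This implies that $\alpha = \alpha^*$ is
selfadjoint and therefore surely normal, proving (1).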


We note that if $V$ is a finitely-generated inner product space and if $\alpha \in \operatorname{End}(V)$
is normal, then, by Propositions 16.5, 16.7 and 18.5, we have $V = \ker(\alpha) \oplus \operatorname{im}(\alpha)$,
and, in particular, $\{\operatorname{im}(\alpha), \ker(\alpha)\}$ is an independent set of subspaces of $V$. Moreover,
$v \perp v'$ for all $v \in \ker(\alpha)$ and $v' \in \operatorname{im}(\alpha)$. While the direct-sum decomposition
is valid for any projection, it is the normality which ensures the orthogonality.


Proposition 18.9 Let $V$ be a finitely-generated inner product space. Let
$W_1, \ldots, W_n$ be subspaces of $V$ and, for each $1 \le i \le n$, let $\alpha_i$ be the projection
of $V$ onto the subspace $W_i$ coming from the decomposition $V = W_i \oplus W_i^\perp$.
Then the following conditions are equivalent:

(1) $V = \bigoplus_{i=1}^{n} W_i$ and $W_h^\perp = \bigoplus_{j \ne h} W_j$ for all $1 \le h \le n$;

(2) $\alpha_1 + \cdots + \alpha_n = \sigma_1$ and $\alpha_i\alpha_j = \sigma_0$ for all $i \ne j$;

(3) If $B_i$ is an orthonormal basis of $W_i$ for each $i$, then $B = \bigcup_{i=1}^{n} B_i$ is an
orthonormal basis of $V$.


Proof This has essentially already been established when we talked about the de- 
composition of a vector space into a direct sum of subspaces. 


Proposition 18.10 Let $F$ be either $\mathbb{R}$ or $\mathbb{C}$ and let $V$ be a finitely-generated
inner product space over $F$. If $p(X) \in F[X]$ and if $\alpha$ is a normal endomorphism
of $V$, then $p(\alpha)$ is a normal endomorphism of $V$.


Proof Let $p(X) = \sum_{i=0}^{n} a_iX^i$. Then $p(\alpha) = \sum_{i=0}^{n} a_i\alpha^i$ and $p(\alpha)^* = \sum_{i=0}^{n} \bar{a}_i(\alpha^*)^i$.
Since $\alpha\alpha^* = \alpha^*\alpha$, it follows from the definition of the product that
$p(\alpha)p(\alpha)^* = p(\alpha)^*p(\alpha)$. Therefore, $p(\alpha)$ is a normal endomorphism of $V$.


Proposition 18.11 Let $V$ be a finitely-generated inner product space and let
$\alpha$ be a normal endomorphism of $V$. If the minimal polynomial of $\alpha$ is
completely reducible, then it does not have multiple roots.


Proof Let $p(X)$ be the minimal polynomial of $\alpha$, which we assume is completely
reducible. Assume that there exists a scalar $c$ and a polynomial $q(X)$ such that
$p(X) = (X - c)^2q(X)$. Since $p(\alpha) = \sigma_0$, we have $(\alpha - c\sigma_1)^2q(\alpha) = \sigma_0$ and so
$\ker((\alpha - c\sigma_1)^2q(\alpha)) = V$. By Proposition 18.10, we know that $\beta = \alpha - c\sigma_1$ is a
normal endomorphism of $V$. Let $v \in V$ and let $w = q(\alpha)(v)$. Then $\beta^2(w) = 0_V$ and
so $\beta(w) \in \operatorname{im}(\beta) \cap \ker(\beta) = \{0_V\}$. Thus we see that $\beta q(\alpha)(v) = 0_V$ for all $v \in V$
and hence $\alpha$ annihilates the polynomial $(X - c)q(X)$, contradicting the minimality
of $p(X)$.


Proposition 18.12 (Spectral Decomposition Theorem) Let $V$ be an inner
product space finitely generated over $\mathbb{C}$ and let $\alpha$ be a normal endomorphism
of $V$. Then there exist scalars $c_1, \ldots, c_n$ and projections $\alpha_1, \ldots, \alpha_n$ of $V$
satisfying:

(1) $\alpha = c_1\alpha_1 + \cdots + c_n\alpha_n$;

(2) $\sigma_1 = \alpha_1 + \cdots + \alpha_n$;

(3) $\alpha_h\alpha_j = \sigma_0$ for all $h \ne j$.

Moreover, these $c_j$ and $\alpha_j$ are unique. The $c_j$ are precisely the distinct
eigenvalues of $\alpha$ and each $\alpha_j$ is the projection of $V$ onto the eigenspace $W_j$
associated with $c_j$ coming from the decomposition $V = W_j \oplus W_j^\perp$.


Proof Let $p(X)$ be the minimal polynomial of $\alpha$, which we will write in the form
$p(X) = \prod_{j=1}^{n}(X - c_j)$, where the $c_j$ are complex numbers which, by
Proposition 18.11, are distinct. For each $1 \le j \le n$, let $p_j(X)$ be the $j$th Lagrange
interpolation polynomial determined by the $c_j$.

Let $f(X)$ be a polynomial of degree at most $n - 1$. Then the polynomial
$f(X) - \sum_{j=1}^{n} f(c_j)p_j(X)$ is of degree at most $n - 1$ and has $n$ distinct roots $c_1, \ldots, c_n$. Thus
it must be the 0-polynomial and so $f(X) = \sum_{j=1}^{n} f(c_j)p_j(X)$. In particular, we see
that $1 = \sum_{j=1}^{n} p_j(X)$ and $X = \sum_{j=1}^{n} c_jp_j(X)$. Set $\alpha_j = p_j(\alpha)$. Then $\sigma_1 = \sum_{j=1}^{n} \alpha_j$
and $\alpha = \sum_{j=1}^{n} c_j\alpha_j$. Note that $\alpha_j \ne \sigma_0$ since $\alpha_j = p_j(\alpha)$ and the degree of $p_j(X)$
is less than the degree of the minimal polynomial of $\alpha$. Moreover, if $h \ne j$ then
there exists a polynomial $u(X) \in \mathbb{C}[X]$ satisfying $\alpha_h\alpha_j = u(\alpha)p(\alpha) = u(\alpha)\sigma_0 = \sigma_0$.
Thus we see that for all $1 \le j \le n$ we have $\alpha_j = \alpha_j\sigma_1 = \sum_{h=1}^{n} \alpha_j\alpha_h = \alpha_j^2$ and
so each $\alpha_j$ is a projection. Thus we see that $\{\operatorname{im}(\alpha_j) \mid 1 \le j \le n\}$ is an independent
set of subspaces of $V$.

Since the minimal polynomial and the characteristic polynomial of $\alpha$ have the
same roots, we know that $\operatorname{spec}(\alpha) = \{c_1, \ldots, c_n\}$. To show that $W_h = \operatorname{im}(\alpha_h)$, we
have to prove that a vector $v$ belongs to $\operatorname{im}(\alpha_h)$ if and only if $\alpha(v) = c_hv$. Indeed,
if $\alpha(v) = c_hv$ then $c_h\left[\sum_{j=1}^{n} \alpha_j(v)\right] = c_hv = \alpha(v) = \sum_{j=1}^{n} (c_j\alpha_j)(v)$ and
so $\left[\sum_{j=1}^{n} (c_h - c_j)\alpha_j\right](v) = 0_V$. Thus, for all $j \ne h$, we have $\alpha_j(v) = 0_V$ and so
$v = \alpha_h(v) \in \operatorname{im}(\alpha_h)$. Conversely, if $v = \alpha_h(u)$ for some $u \in V$ then
$\alpha(v) = \sum_{j=1}^{n} c_j\alpha_j\alpha_h(u) = c_h\alpha_h^2(u) = c_hv$.

Finally, we note that $\alpha_h$ is the projection coming from the decomposition
$V = W_h \oplus W_h^\perp$ since $\alpha_h$ is a polynomial in $\alpha$ and hence normal and so the result
follows from the remark after Proposition 18.8.

Note that we could have deduced Proposition 18.12 directly from
Proposition 18.7. What is important in the above proof is the explicit construction of the
projection maps as polynomials in $\alpha$.

If $\alpha = \sum_{i=1}^{n} c_i\alpha_i$ is as in Proposition 18.12, then $\alpha^k = \left(\sum_{i=1}^{n} c_i\alpha_i\right)^k = \sum_{i=1}^{n} c_i^k\alpha_i$
for any positive integer $k$, and from this we see that if $p(X) \in \mathbb{C}[X]$
then $p(\alpha) = \sum_{i=1}^{n} p(c_i)\alpha_i$.
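The construction of the projections as Lagrange interpolation polynomials evaluated at the endomorphism translates directly into code. The following sketch works with a small real symmetric (hence normal) matrix whose eigenvalues are supplied by hand; the function name `spectral_projections` and the test matrix are assumptions made for illustration.

```python
import math

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def spectral_projections(A, eigenvalues):
    """alpha_j = prod over i != j of (A - c_i I) / (c_j - c_i)."""
    n = len(A)
    I = [[1.0 if r == s else 0.0 for s in range(n)] for r in range(n)]
    projections = []
    for j, cj in enumerate(eigenvalues):
        Pj = I
        for i, ci in enumerate(eigenvalues):
            if i != j:
                factor = [[(A[r][s] - ci * I[r][s]) / (cj - ci) for s in range(n)]
                          for r in range(n)]
                Pj = matmul(Pj, factor)
        projections.append(Pj)
    return projections

A = [[2.0, 1.0], [1.0, 2.0]]   # symmetric, with distinct eigenvalues 1 and 3
c = [1.0, 3.0]
P1, P2 = spectral_projections(A, c)

# p(A) = sum of p(c_i) alpha_i; with p = sqrt this gives a square root of A.
root = [[math.sqrt(c[0]) * P1[r][s] + math.sqrt(c[1]) * P2[r][s] for s in range(2)]
        for r in range(2)]
print(P1, P2)
print(matmul(root, root))      # approximately A
```

The same formula for `root` is what defines the square root of a positive-definite endomorphism in terms of its spectral decomposition.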


Proposition 18.13 Let $V$ be an inner product space finitely generated over $\mathbb{C}$.
A normal endomorphism $\alpha$ of $V$ is positive definite if and only if each of its
eigenvalues is positive.


Proof If $\alpha$ is positive definite then, by Proposition 17.11, each of its eigenvalues
is positive. Conversely, assume each of the eigenvalues of $\alpha$ is positive. By
Proposition 18.12, we write $\alpha = \sum_{j=1}^{n} c_j\alpha_j$, where the $c_j$ are the eigenvalues of $\alpha$ and
the $\alpha_j$ are projections in $\operatorname{End}(V)$ satisfying $\alpha_i\alpha_j = \sigma_0$ for $i \ne j$. If $0_V \ne v \in V$
then $\langle\alpha(v), v\rangle = \sum_{i=1}^{n}\sum_{j=1}^{n} c_i\langle\alpha_i(v), \alpha_j(v)\rangle = \sum_{i=1}^{n} c_i\|\alpha_i(v)\|^2 > 0$ and so $\alpha$ is
positive definite.


Example Let $V = \mathbb{R}^3$. For each $a \in \mathbb{R}$, let $\alpha_a \in \operatorname{End}(V)$ be the normal
endomorphism of $V$ represented with respect to the canonical basis by the matrix
$\begin{bmatrix} 1 & a & a \\ a & 1 & a \\ a & a & 1 \end{bmatrix}$. Then $\operatorname{spec}(\alpha_a) = \{2a + 1, 1 - a\}$ and so, by Proposition 18.13, $\alpha_a$ is
positive definite precisely when $-1 < 2a < 2$.
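The eigenvalues in this example, and the resulting range of $a$, can be confirmed with a short computation. The sketch below takes the matrix to be the $3 \times 3$ matrix with 1 on the diagonal and $a$ elsewhere; the eigenvectors in `eigenpairs` are supplied by hand and the function names are illustrative.

```python
def matrix(a):
    return [[1.0, a, a], [a, 1.0, a], [a, a, 1.0]]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def eigenpairs(a):
    """Eigenvalue 2a + 1 once and 1 - a twice, each with an eigenvector."""
    return [(2 * a + 1, [1.0, 1.0, 1.0]),
            (1 - a, [1.0, -1.0, 0.0]),
            (1 - a, [1.0, 0.0, -1.0])]

def is_positive_definite(a):
    """Positive definite exactly when every eigenvalue is positive."""
    return all(c > 0 for c, _ in eigenpairs(a))

for a in (-0.6, -0.5, 0.0, 0.9, 1.0):
    print(a, is_positive_definite(a), -1 < 2 * a < 2)
```

The two boolean columns agree for every sample value, including the boundary cases $a = -\tfrac{1}{2}$ and $a = 1$, where an eigenvalue vanishes.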


As a consequence of Proposition 18.13 and the comments before it, we see that
if $\alpha$ is a positive-definite endomorphism of a finitely-generated inner product space
$V$ over $\mathbb{C}$ then there exists an endomorphism $\sqrt{\alpha}$ of $V$ satisfying $(\sqrt{\alpha})^2 = \alpha$. This
endomorphism is defined by $\sqrt{\alpha} = \sum_{i=1}^{n} (\sqrt{c_i})\alpha_i$, where the $c_i$ are the eigenvalues
of $\alpha$, and where the $\alpha_i$ are defined as in Proposition 18.12. In particular, if $\beta$ is an
automorphism of $V$ then, by Proposition 17.10, we can talk about $\sqrt{\beta^*\beta}$, which is
also positive definite by Proposition 18.13.


Proposition 18.14 Let $V$ be an inner product space finitely generated over $\mathbb{C}$
and let $\alpha \in \operatorname{Aut}(V)$. Then there exists a unique positive-definite automorphism
$\theta$ of $V$ and a unique unitary automorphism $\psi$ of $V$ satisfying $\alpha = \psi\theta$.


Proof By Proposition 17.10, we know that the automorphism $\alpha^*\alpha$ of $V$ is positive
definite and so we can set $\theta = \sqrt{\alpha^*\alpha}$. Let $\varphi = \theta\alpha^{-1}$. Then $\varphi^* = (\alpha^{-1})^*\theta^* = (\alpha^*)^{-1}\theta$
so $\varphi^*\varphi = (\alpha^*)^{-1}\theta\theta\alpha^{-1} = (\alpha^*)^{-1}\alpha^*\alpha\alpha^{-1} = \sigma_1$, proving that $\varphi$ is unitary
by Proposition 18.1, and hence belongs to $\operatorname{Aut}(V)$. If we now define $\psi = \varphi^{-1}$, we
see that $\alpha = \psi\theta$. Moreover, we note that $\theta \in \operatorname{Aut}(V)$ since $\theta = \varphi\alpha$. To prove
uniqueness, assume that $\psi\theta = \psi'\theta'$, where $\psi$ and $\psi'$ are unitary automorphisms of $V$ and
where $\theta$ and $\theta'$ are positive-definite automorphisms of $V$. Then
$\theta^2 = \theta^*\psi^*\psi\theta = (\theta')^*(\psi')^*\psi'\theta' = (\theta')^2$. Since $\theta$ is positive definite, this implies that $\theta = \theta'$ and so,
since $\theta$ is an automorphism, we have $\psi = \psi\theta\theta^{-1} = \psi'\theta'\theta^{-1} = \psi'$.


The representation of an automorphism $\alpha$ of an inner product space finitely
generated over $\mathbb{C}$ in the form given in Proposition 18.14 is sometimes called the polar
decomposition of $\alpha$.¹ If we move over to matrices, we see that the polar
decomposition of a nonsingular matrix $A \in M_{n\times n}(\mathbb{C})$ is of the form $A = UM$, where $U$
is a unitary matrix and $M$ is a positive-definite Hermitian matrix. Similarly, there
exists a unitary matrix $U'$ and a Hermitian matrix $M'$ satisfying $A^H = U'M'$ and
so $A = M'(U')^H$, where $(U')^H$ is again unitary. In the case we are working over $\mathbb{R}$,
the matrix $U$ is orthogonal, and $M$ is symmetric and positive definite. Because
polar decompositions are important in applications, several iterative algorithms exist
to compute them.


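Example If $a$ and $b$ are nonzero real numbers, then the polar decomposition of the
matrix $\begin{bmatrix} a & -b \\ b & a \end{bmatrix}$ is $\begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix}\begin{bmatrix} r & 0 \\ 0 & r \end{bmatrix}$, where $\theta = \arctan\left(\frac{b}{a}\right)$ and
$r = \sqrt{a^2 + b^2}$.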
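This polar decomposition is simple enough to compute in closed form. The sketch below is an illustration with hypothetical names; note that it uses `math.atan2(b, a)` instead of $\arctan(b/a)$, an assumption made here so that the angle lands in the correct quadrant when $a$ is negative (the two agree for $a > 0$).

```python
import math

def polar_of_scaled_rotation(a, b):
    """Return (U, M) with [[a, -b], [b, a]] = U M, U orthogonal, M = r I."""
    r = math.hypot(a, b)
    theta = math.atan2(b, a)   # equals arctan(b / a) when a > 0
    U = [[math.cos(theta), -math.sin(theta)],
         [math.sin(theta), math.cos(theta)]]
    M = [[r, 0.0], [0.0, r]]
    return U, M

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

U, M = polar_of_scaled_rotation(3.0, 4.0)
print(matmul(U, M))   # approximately [[3, -4], [4, 3]]
```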


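Proposition 18.15 (Singular Value Decomposition Theorem) Let $V$ and
$W$ be inner product spaces of finite dimensions $k$ and $n$, respectively, and
let $\alpha \in \operatorname{Hom}(V, W)$. Then there exists an integer $t \le \min\{k, n\}$, together
with positive real numbers $c_1 \ge c_2 \ge \cdots \ge c_t$ and with orthonormal bases
$\{v_1, \ldots, v_k\}$ of $V$ and $\{w_1, \ldots, w_n\}$ of $W$ satisfying

\[ \alpha(v_i) = \begin{cases} c_iw_i & \text{if } 1 \le i \le t,\\ 0_W & \text{otherwise} \end{cases} \]

and

\[ \alpha^*(w_i) = \begin{cases} c_iv_i & \text{if } 1 \le i \le t,\\ 0_V & \text{otherwise.} \end{cases} \]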


Proof If $\alpha$ is the 0-map, then the result is immediate, so assume that is not
the case. We note that $\beta = \alpha^*\alpha$ is a selfadjoint endomorphism of $V$ and so,
by Proposition 17.7, it is orthogonally diagonalizable. Hence $V$ has an orthonormal
basis $\{v_1, \ldots, v_k\}$ composed of eigenvectors of $\beta$, where each $v_i$ is associated
with an eigenvalue $b_i$. By Proposition 17.10, we know that each $b_i$ belongs
to $\mathbb{R}$. Moreover, for each $i$ we note that
$b_i = b_i\langle v_i, v_i\rangle = \langle b_iv_i, v_i\rangle = \langle\beta(v_i), v_i\rangle = \langle\alpha(v_i), \alpha(v_i)\rangle \ge 0$.
Indeed, renumbering if necessary, we can assume that there exists an integer $t \le k$
such that $b_1 \ge b_2 \ge \cdots \ge b_t > 0$ while $b_{t+1} = \cdots = b_k = 0$.
For each $1 \le i \le t$, set $c_i = \sqrt{b_i}$ and let $w_i = c_i^{-1}\alpha(v_i) \in W$. If $i \ne j$ then
$\langle w_i, w_j\rangle = (c_ic_j)^{-1}\langle\alpha(v_i), \alpha(v_j)\rangle = (c_ic_j)^{-1}\langle\beta(v_i), v_j\rangle = (c_ic_j)^{-1}b_i\langle v_i, v_j\rangle = 0$
while, for each $1 \le i \le t$, we have
$\langle w_i, w_i\rangle = c_i^{-2}\langle\alpha(v_i), \alpha(v_i)\rangle = c_i^{-2}\langle\beta(v_i), v_i\rangle = c_i^{-2}\langle b_iv_i, v_i\rangle = \langle v_i, v_i\rangle = 1$.
Thus we see that the set $\{w_1, \ldots, w_t\}$ is orthonormal.
Moreover, for each $1 \le i \le t$ we have $\|\alpha(v_i)\|^2 = b_i$ so $\|\alpha(v_i)\| = c_i$ and
$\alpha^*(w_i) = \alpha^*(c_i^{-1}\alpha(v_i)) = c_i^{-1}\alpha^*\alpha(v_i) = c_i^{-1}\beta(v_i) = c_i^{-1}b_iv_i = c_iv_i$. For $t + 1 \le i \le k$ we
have $\alpha^*\alpha(v_i) = \beta(v_i) = 0_V$ and so $0 = \langle\beta(v_i), v_i\rangle = \langle\alpha(v_i), \alpha(v_i)\rangle$, which implies
that $\alpha(v_i) = 0_W$. Thus $v_i \in \ker(\alpha)$ for each $t + 1 \le i \le k$.

We are therefore left with the matter of defining $w_{t+1}, \ldots, w_n$ in the case $t < n$.
By Proposition 16.18, we know that $\ker(\alpha^*) = \operatorname{im}(\alpha)^\perp$ and so, if we pick an
orthonormal basis $\{w_{t+1}, \ldots, w_n\}$ for $\ker(\alpha^*)$ we see that $\{w_1, \ldots, w_n\}$ is an orthonormal
basis for $W$ having the desired properties.

¹Polar decompositions were first studied by the French engineer Léon Autonne at the beginning
of the twentieth century.


The first version of the Singular Value Decom- 
position Theorem was proven by the nineteenth- 
century Italian mathematician Eugenio Beltrami; 
it was subsequently extended by many others, in- 
cluding Camille Jordan and Sylvester. Schmidt 
extended this theorem to infinite-dimensional 
spaces. Effective algorithms for computation of 
singular value decompositions were developed by 
the twentieth-century American computer scientist Gene H. Golub, along with William 
Kahan. 


The scalars $c_1 \ge c_2 \ge \cdots \ge c_t$ given in Proposition 18.15 are called the
singular values of the linear transformation $\alpha$. The number $c_1/c_t$, called the spectral
condition number, is used as a measure of the numerical instability of the matrix
representing $\alpha^*\alpha \in \operatorname{End}(V)$ with respect to the given basis.

If we consider the special case of a linear transformation $\alpha : \mathbb{C}^k \to \mathbb{C}^n$
represented with respect to the canonical bases by a matrix $A \in M_{n\times k}(\mathbb{C})$, the Singular
Value Decomposition Theorem says that there exist unitary matrices $P \in M_{n\times n}(\mathbb{C})$
and $Q \in M_{k\times k}(\mathbb{C})$ such that $A$ can be written as $P\begin{bmatrix} D & O \\ O & O \end{bmatrix}Q^H$, where
$D \in M_{t\times t}(\mathbb{R})$ is a diagonal matrix having the singular values of $\alpha$ on the diagonal.
These singular values are precisely the square roots of the eigenvalues of $A^HA$. The
columns of $Q$ form an orthonormal basis for $\mathbb{C}^k$ consisting of eigenvectors of $A^HA$,
and the columns of $P$ form an orthonormal basis for $\mathbb{C}^n$.

If $\alpha : \mathbb{R}^k \to \mathbb{R}^n$ then, of course, the matrices $P$ and $Q$ are orthogonal and
$A = P\begin{bmatrix} D & O \\ O & O \end{bmatrix}Q^T$.

Example The matrix $A = \frac{1}{10}\begin{bmatrix} 20 & 20 & -20 & 20 \\ 1 & 17 & 1 & -17 \\ 18 & 6 & 18 & -6 \end{bmatrix}$ can be written as a product
$P\begin{bmatrix} 4 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 2 & 0 \end{bmatrix}Q$, where

\[ P = \frac{1}{5}\begin{bmatrix} 5 & 0 & 0 \\ 0 & 3 & -4 \\ 0 & 4 & 3 \end{bmatrix} \quad\text{and}\quad Q = \frac{1}{2}\begin{bmatrix} 1 & 1 & -1 & 1 \\ 1 & 1 & 1 & -1 \\ 1 & -1 & 1 & 1 \\ 1 & -1 & -1 & -1 \end{bmatrix} \]

are orthogonal and where the singular values of $A$ are 4, 3, 2.
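The factorization in this example can be verified with exact rational arithmetic. The sketch below assumes that the scalar in front of $A$ is $\frac{1}{10}$ and that the entries of $P$ and $Q$ are as written in the code; with these values the product reproduces $A$ exactly. The helper names are illustrative.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Product of two (possibly rectangular) matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

# Assumed entries, with the scalar factors 1/10, 1/5 and 1/2 applied entrywise.
A = [[F(v, 10) for v in row] for row in
     [[20, 20, -20, 20], [1, 17, 1, -17], [18, 6, 18, -6]]]
P = [[F(v, 5) for v in row] for row in
     [[5, 0, 0], [0, 3, -4], [0, 4, 3]]]
Q = [[F(v, 2) for v in row] for row in
     [[1, 1, -1, 1], [1, 1, 1, -1], [1, -1, 1, 1], [1, -1, -1, -1]]]
S = [[4, 0, 0, 0], [0, 3, 0, 0], [0, 0, 2, 0]]

product = matmul(matmul(P, S), Q)
print(product == A)
print(matmul(P, transpose(P)) == identity(3), matmul(Q, transpose(Q)) == identity(4))
```

Because $P$ and $Q$ are orthogonal, $AA^T = P\,\mathrm{diag}(16, 9, 4)\,P^T$, which shows that the squares of the singular values are the eigenvalues of $AA^T$.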


Singular value decompositions have many applications, and play important roles 
in the mathematics of optimization, data compression, population genetics, and im- 
age processing. They are especially useful since accurate and relatively-efficient 
algorithms for computing these decompositions are readily available in many com- 
mon linear-algebra software packages. In particular, in many applications one needs 
to compute the singular value decomposition of a product of a large number of ma- 
trices (often over 1,000) and there exist algorithms to do that without having to 
multiply out the matrices explicitly. 


Exercises 


Exercise 1108
Let $A \in M_{n\times n}(\mathbb{C})$ be similar to a unitary matrix. Is $A^{-1}$ necessarily similar
to $A^H$?


Exercise 1109
Let $n$ be a positive integer and let $A \in M_{n\times n}(\mathbb{C})$ be a nonsingular matrix having
a singular value decomposition $A = PDQ^H$, where $P$ and $Q$ are unitary matrices
and $D = \begin{bmatrix} c_1 & & O \\ & \ddots & \\ O & & c_n \end{bmatrix}$ is a diagonal matrix with $c_1 \ge \cdots \ge c_n$. If $B$ is a
singular matrix, show that $\|A - B\| \ge c_n$, where $\|\cdot\|$ denotes the spectral norm.


Exercise 1110

Let $n > 1$ be an integer and let $V$ be the subspace of $\mathbb{C}[X]$ consisting of all
polynomials of degree at most $n$. Let $0 \ne c \in \mathbb{C}$ and let $\alpha$ be the endomorphism
of $V$ defined by $\alpha: p(X) \mapsto p(X + c)$. Is it possible to define an inner product
on $V$ relative to which $\alpha$ is normal?

Exercise 1111
Let $a, b, c \in \mathbb{C}$. Find the set of all triples $(x, y, z)$ of complex numbers satisfying
the condition that the matrix $\begin{bmatrix} a & x & y \\ 0 & b & z \\ 0 & 0 & c \end{bmatrix}$ represents a normal endomorphism
of $\mathbb{C}^3$, endowed with the dot product, with respect to the canonical basis.


Exercise 1112 
Show that any Givens rotation matrix in M2x2(R) can be written as the product 
of two Jacobi reflection matrices. 


Exercise 1113

Let $n$ be a positive integer. A matrix $A \in M_{n\times n}(\mathbb{C})$ is normal if and only if
$A^HA = AA^H$. Show that every normal upper-triangular matrix is a diagonal
matrix.


Exercise 1114

Let $n$ be a positive integer and let $A \in M_{n\times n}(\mathbb{R})$. Then $A$ is normal if and only
if $A^TA = AA^T$. If $A$ is normal, is $e^A$ normal? Is the converse of this statement
true?


Exercise 1115
Let $V = \mathbb{R}^2$, together with the dot product. Show that a matrix in $M_{2\times 2}(\mathbb{R})$ is of
the form $\Phi_{BB}(\alpha)$ for some normal endomorphism $\alpha$ of $V$ which is not selfadjoint
if and only if it is of the form $\begin{bmatrix} a & -b \\ b & a \end{bmatrix}$ for real numbers $a$ and $b \ne 0$.


Exercise 1116

Let $V$ be an inner product space finitely generated over $\mathbb{R}$ and let $S$ be the set of
all isometries of $V$. Is $S$ an $\mathbb{R}$-subalgebra of $\operatorname{End}(V)$?


Exercise 1117

Let $n$ be a positive integer and let $V = \mathbb{C}^n$ on which we have the dot product. If
$\alpha \in \operatorname{End}(V)$, let $G(\alpha) = \{\langle\alpha(v), v\rangle \mid \|v\| = 1\}$. For the special case $n = 2$, find
$G(\alpha)$ and $G(\beta)$, where $\alpha$ is represented with respect to the canonical basis by


. |1 0 ; ; 3 ; 
the matrix | and £ is represented with respect to the canonical basis by 


0 0 
. JO 2 
the matrix t al ; 


Exercise 1118 

Let V = R? on which we have the dot product, and let W be the space of all 
polynomial functions in RF of degree at most 2, on which we define the inner 
product ( f, g) = i Ft (x)g(x) dx. Let a e Hom(V, W) be defined by
a
b 
a b ee 1+ St E+ cx + cx. 
c 


Is this linear transformation an isometry? 


Exercise 1119
Let $V$ be an inner product space and let $\alpha$ be an endomorphism of $V$ satisfying
the condition that $\alpha^*\alpha = \sigma_0$. Show that $\alpha = \sigma_0$.


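Exercise 1120
Let $V = \mathbb{R}^3$ with the dot product, and let $\alpha$ be the automorphism of $V$ defined by
$\alpha: \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} -c \\ -b \\ -a \end{bmatrix}$. Is $\alpha$ unitary?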


Exercise 1121 


Is the matrix
\[ \begin{bmatrix} 0 & 0 & 0 & i \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ i & 0 & 0 & 0 \end{bmatrix} \in M_{4\times 4}(\mathbb{C}) \]
unitary?


Exercise 1122 
Find a real number a satisfying the condition that the matrix 


—9 + 8i 10—4i —16-— 18i 
a| —2—24i 1+12i -10-—4i | €M3,3(C) 
4— 10i 2—24i 9+ 8i 


is unitary. 


Exercise 1123 
Find a real number a satisfying the condition that the matrix 


12 6— 12i 12+61 6-6 


1 | 64 12i a 5i 3+i 
74| 12-6i —Si a dea |S an 
6+ 6i 3-i 1+3i —22 
is unitary. 
Exercise 1124 
Given a real number $a$, check if the matrix
\[ \begin{bmatrix} -\sin^2(a) + i\cos^2(a) & (1+i)\sin(a)\cos(a) \\ (1+i)\sin(a)\cos(a) & -\cos^2(a) + i\sin^2(a) \end{bmatrix} \in M_{2\times 2}(\mathbb{C}) \]
is unitary.


Exercise 1125
Find all possible triples a, b, c of real numbers, if any exist, such that the matrix 


1 —2 2 
4| 2 —1 2 | is orthogonal. 
a b c 


Exercise 1126 


1 —2a 2a’ 
For which a € R is a 2a 1—2a* 2a | € M3x3(R) an orthogonal 
2a? 2a 1 
matrix? 
Exercise 1127 
1 1 1 1 
Is th aii oe R) orth 1? 
s the matrix 5 I l € Ma4x4(R) orthogonal? 
1 —1 -1 1 


Exercise 1128 


If v = H € R?, show that there exists an orthogonal matrix A € M2x2(R) and 


a real number b satisfying the condition that Av = Hi 


Exercise 1129 
Let a and b be real numbers, not both equal to 0. Show that the matrix 


ab a(a+b) b(a+b) 
—>——— | a(a+b) —b(a+b) ab e M3 x3(R) 
a? +ab+b? | path) ab —a(a +b) 


1 


is orthogonal. 


Exercise 1130 
2a —2a a 


Find all a € R such that the matrix | —2a  —a 2a | € M3x3(R) is orthog- 
a 2a 2a 
onal. 


Exercise 1131 


4 -1 1 
Let A= | —1 4 —1 | € M3x3(R). Find an orthogonal matrix P such that 
1 -l 4 


PT" AP is a diagonal matrix. 
Exercise 1132


1 00 0 
0 0 1 0 . : 

Let A= 010 0 € M4x4(R). Find an orthogonal matrix P such that 
0 0 0 1 


P" AP isa diagonal matrix. 


Exercise 1133 
Let $n$ be a positive integer and let $A$ and $B$ be orthogonal matrices in $M_{n\times n}(\mathbb{R})$
satisfying $|A| + |B| = 0$. Show that $|A + B| = 0$.


Exercise 1134 
1 -1 272 
Let A= 5 2/2 2/2 0 €e M3,3(R). Find an infinite number of pairs 
-1 1 2⁄2 


(P, Q) of orthogonal matrices such that P AQ is a diagonal matrix. 
Exercise 1135 


4 

z 0 

5 
Find an a € R such that the matrix a 0 | € M3x3(R) is orthogonal. 
0 1 


Oue & 


Exercise 1136 
Let $A, B \in M_{k\times n}(\mathbb{R})$ be matrices such that the columns of each form orthonormal
bases for the same subspace $W$ of $\mathbb{R}^k$. Show that $AA^T = BB^T$.


Exercise 1137 


Let A,B € My xn(R) be orthogonal matrices. Is the matrix E | € 


Monx2n(R) necessarily orthogonal? 


Exercise 1138 
Let $n$ be a positive integer and let $A \in M_{n\times n}(\mathbb{R})$ be a skew-symmetric matrix.
Show that $(A - I)^{-1}(A + I)$ is an orthogonal matrix which does not have 1 as
an eigenvalue.


Exercise 1139 


Find two distinct functions fi, f2 : R4 \ — R satisfying the condition 


oooco 


that 
PLP acta 2(bc — da) 2(bd — ca) 
2(bc — da) a? + b? — c? — d? 2(cd — ba) 
2(bd — ca) 2(cd — ba) a +b? — e -— d? 


aa Sna 


is always an orthogonal matrix. 


Exercise 1140 
Let n be a positive integer and let A, BE Mnxn(R) satisfy A? + B? = I. Is the 


matrix ae 
B A 


| € Monx2n(R) necessarily orthogonal? 

Exercise 1141 

Let $O \neq A \in M_{3\times 3}(\mathbb{C})$ be a matrix satisfying $\mathrm{adj}(A) = A^H$. Show that $A$ is a
unitary matrix having determinant 1.


Exercise 1142 
Let $n$ be a positive integer and let $\alpha$ be the endomorphism of $\mathbb{C}^n$ defined by
$\alpha : v \mapsto iv$. Is $\alpha$ normal?


Exercise 1143 
Let $V$ be an inner product space and let $\alpha, \beta \in \mathrm{End}(V)$ be normal. Is $\beta\alpha$
necessarily normal?


Exercise 1144 

Let $V$ be an inner product space finitely generated over $\mathbb{C}$ and let $\alpha \in \mathrm{End}(V)$
satisfy the condition that every eigenvector of $\beta = \alpha + \alpha^*$ is also an eigenvector
of $\gamma = \alpha - \alpha^*$. Prove that $\alpha$ is normal.


Exercise 1145 
Let œ be the endomorphism of C? represented with respect to the canonical basis 


by the matrix A = i T Is æ normal? 


Exercise 1146 
Let $V$ be an inner product space over $\mathbb{C}$ and let $\alpha \in \mathrm{End}(V)$ be normal. If $c \in \mathbb{C}$,
is the endomorphism $\alpha - c\sigma_1$ necessarily normal?


Exercise 1147 
Let $V$ be an inner product space finitely generated over $\mathbb{C}$ and let $\sigma_0 \neq \alpha \in \mathrm{End}(V)$
be normal. Show that $\alpha$ is not nilpotent.


Exercise 1148
Let a,b € R let a € End(R?) be represented with respect to the canonical ba- 


a —2 b 
sis by b a —2 |. For which values of a and b is this endomorphism 
=2 3 a 


normal? 


Exercise 1149 
Let œ € End (R?) be represented with respect to the canonical basis by the matrix 


14 2 14 
5 2 —1 -—16 |. Show that @ is selfadjoint and find an orthonormal basis 
14 —16 5 


of R? composed of eigenvectors of a. 


Exercise 1150 
Let œ be the endomorphism of CÌ? represented with respect to the canonical basis 


6 -2 3 
by the matrix 3 6 —2 |. Show that «œ is normal and find an orthonormal 
—2 3 6 


basis of C? composed of eigenvectors of a. 


Exercise 1151 

Let $V$ be an inner product space and let $\sigma_0 \neq \alpha \in \mathrm{End}(V)$ be a normal projection.
Show that $\|\alpha(v)\| \leq \|v\|$ for all $v \in V$, with equality whenever $v \in \mathrm{im}(\alpha)$. Give
an example where this does not hold for $\alpha$ which is not normal.


Exercise 1152 
Let $n$ be a positive integer and let $F$ be any field. A matrix $A \in M_{n\times n}(F)$ is
antiorthogonal if and only if $A^{-1} = -A^T$. Give an example of an antiorthogonal
matrix in $M_{2\times 2}(GF(3))$.


Exercise 1153 
a a1 
Let w € End(R*) be defined bya: nats: a2 . Show that œ is normal 
a3 a3 + a4 
a4 a4 — a3 


but not selfadjoint. 


Exercise 1154 
Let œ : R? — R? be the linear transformation represented with respect to the 
1 1 


canonical bases by the matrix A= | 2 2 |. Find the singular values of a. 
2.2 
Exercise 1155


Let n be a positive integer, let F be a field, and let J be the identity matrix in 
Maxn(F). A matrix A € Moy x27 (F) is symplectic if and only if 


If B,C € Mnxn(R), show that B+iC € My xn(C) is unitary if and only if the 


matrix E p € Manx2n (R) is symplectic. 
Exercise 1156 
0 --1//2 d 
For which c,d € C is the matrix c 1/2 i/2 | Hermitian? For 


ijJ2 —i/2 1/2 


which values of c and d is it unitary? 


Exercise 1157 

A polynomial $p(X) \in \mathbb{C}[X]$ of degree $n > 0$ is a reciprocal polynomial if and
only if $p(X) = \pm X^n p(X^{-1})$. Show that characteristic polynomials of orthogonal
matrices are reciprocal and that the set of all reciprocal polynomials, together
with the 0-polynomial, forms a subalgebra of $\mathbb{C}[X]$.


Exercise 1158 
(Cayley representation) For any real number $t$, with $\cos(t) \neq -1$, find a
skew-symmetric matrix $A \in M_{2\times 2}(\mathbb{R})$ satisfying
\[ \begin{bmatrix} \cos(t) & \sin(t) \\ -\sin(t) & \cos(t) \end{bmatrix} = (I - A)(I + A)^{-1}. \]


Exercise 1159 
Let V be a vector space finitely generated over C and let œ be an automorphism 
of V having polar decomposition a = 6, where y is unitary and 0 is positive 
definite. Show that œ is normal if and only if æ = 6. 


19 Moore-Penrose Pseudoinverses


Let $V$ and $W$ be inner product spaces, and let $\alpha : V \to W$ be a linear transformation.
We know that there exists a linear transformation $\beta : W \to V$ satisfying the
condition that $\beta\alpha$ is the identity function on $V$ and $\alpha\beta$ is the identity function on
$W$ if and only if $\alpha$ is an isomorphism; in this case, $\beta = \alpha^{-1}$. If both spaces are
finitely generated, we also know that such an isomorphism can exist only when
$\dim(V) = \dim(W)$. If $\alpha$ is not an isomorphism, it is possible to weaken the notion
of the inverse of a function. Given a linear transformation $\alpha : V \to W$, we say that
a linear transformation $\beta : W \to V$ is a Moore-Penrose pseudoinverse of $\alpha$ if and
only if the following conditions are satisfied:

(1) $\alpha\beta\alpha = \alpha$ and $\beta\alpha\beta = \beta$;

(2) The endomorphisms $\beta\alpha \in \mathrm{End}(V)$ and $\alpha\beta \in \mathrm{End}(W)$ are selfadjoint.



Eliakim Hastings Moore developed this construction in 1922, but it did 
not receive much attention at the time; it was rediscovered independently 
in 1955 by Sir Roger Penrose, a contemporary British applied mathe- 
matician, best known for his collaboration with the physicist Stephen 
Hawking. 


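Example The two parts of condition (2) in the definition of the Moore-Penrose
pseudoinverse are independent. To see this, consider the linear transformation
$\alpha : \mathbb{R}^3 \to \mathbb{R}^2$ defined by
\[ \alpha : v \mapsto \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 0 \end{bmatrix} v. \]
For any $c, d \in \mathbb{R}$, let $\beta : \mathbb{R}^2 \to \mathbb{R}^3$ be the linear transformation defined by
\[ \beta : w \mapsto \begin{bmatrix} 1-3c & -2-3d \\ 0 & 1 \\ c & d \end{bmatrix} w. \]
Then one can check that $\alpha\beta\alpha = \alpha$ and $\beta\alpha\beta = \beta$ and that $\alpha\beta = \sigma_1$ in $\mathrm{End}(\mathbb{R}^2)$.
On the other hand,
\[ \beta\alpha : v \mapsto \begin{bmatrix} 1-3c & -6c-3d & 3-9c \\ 0 & 1 & 0 \\ c & 2c+d & 3c \end{bmatrix} v, \]
and it is easy enough to choose $c$ and $d$ so that this matrix is not symmetric and
hence $\beta\alpha$ is not selfadjoint.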
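
The four defining conditions can be checked mechanically for small matrices. The following is a minimal sketch in exact rational arithmetic, assuming matrices are stored as lists of rows; the helper names `matmul`, `transpose`, `penrose_report`, and `beta` are illustrative and do not come from the text. It uses the matrix $\begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 0 \end{bmatrix}$ together with a two-parameter family of candidate pseudoinverses of the kind considered in the example.

```python
from fractions import Fraction as Fr

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def penrose_report(A, B):
    """Report which of the four defining conditions hold for the pair (A, B)."""
    AB, BA = matmul(A, B), matmul(B, A)
    return {
        "ABA = A": matmul(AB, A) == A,
        "BAB = B": matmul(BA, B) == B,
        "AB selfadjoint": AB == transpose(AB),
        "BA selfadjoint": BA == transpose(BA),
    }

# Matrix of alpha : R^3 -> R^2 with respect to the canonical bases.
A = [[Fr(1), Fr(2), Fr(3)], [Fr(0), Fr(1), Fr(0)]]

def beta(c, d):
    """Candidate pseudoinverse depending on two rational parameters c and d."""
    c, d = Fr(c), Fr(d)
    return [[1 - 3 * c, -2 - 3 * d], [Fr(0), Fr(1)], [c, d]]

generic = penrose_report(A, beta(1, 1))                   # BA is not symmetric
special = penrose_report(A, beta(Fr(3, 10), Fr(-3, 5)))   # all four conditions hold
```

Working over the rationals avoids any tolerance question when testing equalities; for the particular choice $c = 3/10$, $d = -3/5$ all four conditions hold, so that member of the family is the Moore-Penrose pseudoinverse.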


We will denote the Moore-Penrose pseudoinverse of $\alpha$ by $\alpha^+$. Of course, in order
to justify this notation we have to show that $\alpha^+$ exists and is unique, which we will
do for the case that $V$ and $W$ are finitely generated. We will begin with uniqueness.


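Proposition 19.1 Let $V$ and $W$ be inner product spaces and let $\alpha : V \to W$
be a linear transformation. If $\alpha$ has a Moore-Penrose pseudoinverse, it must
be unique.


Proof Suppose that $\beta, \gamma \in \mathrm{Hom}(W, V)$ are Moore-Penrose pseudoinverses of $\alpha$.
Then
\[ \beta = \beta\alpha\beta = (\beta\alpha)^*\beta = \alpha^*\beta^*\beta = (\alpha\gamma\alpha)^*\beta^*\beta = (\gamma\alpha)^*\alpha^*\beta^*\beta = \gamma\alpha\alpha^*\beta^*\beta = \gamma\alpha(\beta\alpha)^*\beta = \gamma\alpha\beta\alpha\beta = \gamma\alpha\beta \]
and
\[ \gamma\alpha\beta = \gamma\alpha\gamma\alpha\beta = \gamma(\alpha\gamma)^*\alpha\beta = \gamma\gamma^*\alpha^*\alpha\beta = \gamma\gamma^*\alpha^*(\alpha\beta)^* = \gamma\gamma^*(\alpha\beta\alpha)^* = \gamma\gamma^*\alpha^* = \gamma(\alpha\gamma)^* = \gamma\alpha\gamma = \gamma, \]
and so we have proven uniqueness.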


In particular, if $\alpha : V \to W$ is an isomorphism, then, by Proposition 19.1, we
have $\alpha^+ = \alpha^{-1}$. If $\alpha$ is the 0-function then so is $\alpha^+$.


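Proposition 19.2 Let $V$ and $W$ be finitely-generated inner product spaces
and let $\alpha : V \to W$ be a linear transformation.

(1) If $\alpha$ is a monomorphism, then $\alpha^+$ exists and equals $(\alpha^*\alpha)^{-1}\alpha^*$. Moreover,
$\alpha^+\alpha$ is the identity function on $V$;

(2) If $\alpha$ is an epimorphism, then $\alpha^+$ exists and equals $\alpha^*(\alpha\alpha^*)^{-1}$. Moreover,
$\alpha\alpha^+$ is the identity function on $W$.


Proof (1) From Proposition 16.20, we see that if $\alpha$ is a monomorphism then
$\alpha^*\alpha \in \mathrm{Aut}(V)$, and so $(\alpha^*\alpha)^{-1}$ exists. Set $\beta = (\alpha^*\alpha)^{-1}\alpha^*$. Then $\beta\alpha$ is the identity
function on $V$, and so $\beta\alpha$ is a selfadjoint endomorphism of $V$ which satisfies
$\alpha\beta\alpha = \alpha$ and $\beta\alpha\beta = \beta$. Finally,
\[ (\alpha\beta)^* = [\alpha(\alpha^*\alpha)^{-1}\alpha^*]^* = \alpha[(\alpha^*\alpha)^{-1}]^*\alpha^* = \alpha[(\alpha^*\alpha)^*]^{-1}\alpha^* = \alpha(\alpha^*\alpha)^{-1}\alpha^* = \alpha\beta, \]
and so $\alpha\beta$ is also selfadjoint. Thus $\beta = \alpha^+$.
(2) From Proposition 16.20, we see that if $\alpha$ is an epimorphism then $\alpha\alpha^* \in \mathrm{Aut}(W)$
and so $(\alpha\alpha^*)^{-1}$ exists. As in (1), we see that $\alpha^*(\alpha\alpha^*)^{-1} = \alpha^+$.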


Example Let $\alpha : \mathbb{R}^2 \to \mathbb{R}^3$ be the linear transformation represented with respect to
the canonical bases by the matrix $\begin{bmatrix} 1 & 2 \\ -1 & 3 \\ 2 & 4 \end{bmatrix}$. This is a monomorphism and so, by
Proposition 19.2, $\alpha^+$ exists and is represented with respect to the canonical bases
by the matrix $\frac{1}{25}\begin{bmatrix} 3 & -10 & 6 \\ 1 & 5 & 2 \end{bmatrix}$.
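
The closed formulas of Proposition 19.2 translate directly into a few lines of numerical code. The sketch below assumes NumPy is available and uses floating-point arithmetic; the variable names are illustrative. It applies part (1) to the $3 \times 2$ matrix of the example and part (2) to its transpose.

```python
import numpy as np

# Matrix of a monomorphism R^2 -> R^3 (its two columns are linearly independent).
A = np.array([[1.0, 2.0], [-1.0, 3.0], [2.0, 4.0]])

# Proposition 19.2(1): A^+ = (A^T A)^{-1} A^T, obtained by solving (A^T A) X = A^T.
A_plus = np.linalg.solve(A.T @ A, A.T)

# Proposition 19.2(2) applied to the epimorphism represented by B = A^T:
# B^+ = B^T (B B^T)^{-1}.
B = A.T
B_plus = B.T @ np.linalg.inv(B @ B.T)

# Value of A^+ worked out by hand for this matrix.
expected = np.array([[3.0, -10.0, 6.0], [1.0, 5.0, 2.0]]) / 25.0
```

Solving the linear system $(A^TA)X = A^T$ instead of forming the inverse explicitly is a common design choice; both routes give the same matrix here.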


Proposition 19.3 Let $V$ and $W$ be finitely-generated inner product spaces.
Then every $\alpha \in \mathrm{Hom}(V, W)$ has a Moore-Penrose pseudoinverse $\alpha^+ \in \mathrm{Hom}(W, V)$.


Proof Let $Y = \mathrm{im}(\alpha)$, and write $\alpha = \mu\beta$, where $\beta : V \to Y$ is an epimorphism
given by $\beta : v \mapsto \alpha(v)$, and $\mu : Y \to W$ is the inclusion monomorphism. By
Proposition 19.2, we know that $\beta^+ \in \mathrm{Hom}(Y, V)$ and $\mu^+ \in \mathrm{Hom}(W, Y)$ exist and satisfy
the conditions that $\beta\beta^+$ and $\mu^+\mu$ are equal to the identity function on $Y$. Therefore,
we see that $(\mu\beta)(\beta^+\mu^+)(\mu\beta) = \mu\beta$ and $(\beta^+\mu^+)(\mu\beta)(\beta^+\mu^+) = \beta^+\mu^+$, and we see
that $(\beta^+\mu^+)(\mu\beta) = \beta^+\beta$ and $(\mu\beta)(\beta^+\mu^+) = \mu\mu^+$ are selfadjoint. Thus $\alpha^+$ exists
and equals $\beta^+\mu^+$.


As an immediate consequence of this, we note that if $\alpha$ is an endomorphism
of a finitely-generated inner product space $V$ then, by Proposition 6.11, we see
that $\mathrm{rk}(\alpha\alpha^+) \leq \mathrm{rk}(\alpha)$ and $\mathrm{rk}(\alpha) = \mathrm{rk}(\alpha\alpha^+\alpha) \leq \mathrm{rk}(\alpha\alpha^+)$ and so $\mathrm{rk}(\alpha\alpha^+) = \mathrm{rk}(\alpha)$.
Similarly, $\mathrm{rk}(\alpha^+\alpha) = \mathrm{rk}(\alpha)$.

If $F$ is either $\mathbb{R}$ or $\mathbb{C}$, and if we are given a linear transformation $\alpha : F^n \to F^k$
which is represented with respect to the canonical bases by the matrix $A = [a_{ij}]$,
then we will denote the matrix representing $\alpha^+$ with respect to these bases by $A^+$.
Thus the matrix $A^+$ has the following properties:

(1) $AA^+A = A$ and $A^+AA^+ = A^+$;
(2) The matrices $AA^+$ and $A^+A$ are symmetric (in the case $F = \mathbb{R}$) or Hermitian
(in the case $F = \mathbb{C}$).


Example Let $A = \begin{bmatrix} 1 & -1 & 2 \\ 2 & 1 & -2 \\ 3 & 0 & 0 \end{bmatrix} \in M_{3\times 3}(\mathbb{R})$. This matrix is clearly singular and
hence $A^{-1}$ does not exist. However, we can check that
\[ A^+ = \frac{1}{45}\begin{bmatrix} 5 & 5 & 10 \\ -5 & 4 & -1 \\ 10 & -8 & 2 \end{bmatrix}. \]


For nonsingular square matrices of the same size $A$ and $B$, we know that
$(AB)^{-1} = B^{-1}A^{-1}$. A similar equality does not hold for the Moore-Penrose
pseudoinverse, as the following example shows.




6 


2 
Example If A = | 7 14 


| and B = E 1 in M2x2(R), then AB = È a 


1 2 2 1 
1 1 7 . 
Then At = 16 d and Bt = [> | so BtAt = miala 3 |; white 


The Singular Value Decomposition Theorem can be used to compute pseudoin- 
verses. This is important since, as we have remarked previously, there exist several 
relatively efficient and stable numerical algorithms for computing such decomposi- 
tions. 


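Example Let $\alpha : \mathbb{R}^n \to \mathbb{R}^k$ be a monomorphism represented with respect to the
canonical bases by a matrix $A$. By Proposition 19.2, we have $A^+ = (A^TA)^{-1}A^T$.
By Proposition 18.15, we set $A = PEQ^T$, where $P \in M_{k\times k}(\mathbb{R})$ and $Q \in M_{n\times n}(\mathbb{R})$
are orthogonal matrices and $E \in M_{k\times n}(\mathbb{R})$ is of the form $E = \begin{bmatrix} D \\ O \end{bmatrix}$
for a diagonal matrix $D \in M_{n\times n}(\mathbb{R})$, the diagonal entries of which are nonzero.
Then
\[ A^+ = (A^TA)^{-1}A^T = (QE^TP^TPEQ^T)^{-1}QE^TP^T = (QD^2Q^T)^{-1}QE^TP^T = QD^{-2}E^TP^T = Q\begin{bmatrix} D^{-1} & O \end{bmatrix}P^T. \]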
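
The same idea gives a general recipe once a singular value decomposition is available: invert the nonzero singular values and leave the zero ones at zero. The sketch below assumes NumPy and real matrices; the function name `pinv_via_svd` and the relative cutoff `rel_tol` (used to decide which computed singular values count as zero) are assumptions of this illustration, not part of the text.

```python
import numpy as np

def pinv_via_svd(A, rel_tol=1e-12):
    """Pseudoinverse of a real matrix from its singular value decomposition."""
    # A = U diag(s) Vt, with orthonormal columns in U and orthonormal rows in Vt.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    cutoff = rel_tol * s.max() if s.size else 0.0
    # Invert only the singular values that are clearly nonzero.
    s_inv = np.array([1.0 / x if x > cutoff else 0.0 for x in s])
    return Vt.T @ np.diag(s_inv) @ U.T

full_rank = np.array([[1.0, 2.0], [-1.0, 3.0], [2.0, 4.0]])
singular = np.array([[1.0, -1.0, 2.0], [2.0, 1.0, -2.0], [3.0, 0.0, 0.0]])

P1 = pinv_via_svd(full_rank)
P2 = pinv_via_svd(singular)
```

The cutoff matters in floating-point arithmetic: a singular value that is zero in exact arithmetic is typically computed as a tiny nonzero number, and inverting it would produce enormous spurious entries.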


1 


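Example If $A(t) = \begin{bmatrix} 1 & 0 \\ 0 & t \end{bmatrix}$ for all real numbers $t$ then we see that
$A(t)^+ = \begin{bmatrix} 1 & 0 \\ 0 & t^{-1} \end{bmatrix}$ when $t \neq 0$, but is equal to $\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$ for $t = 0$. Thus we see that not only
is $\lim_{t\to 0} A(t)^+$ not equal to $A(0)^+$, but indeed that the value of $A(t)^+$ moves farther
and farther away from $A(0)^+$ as $t$ approaches 0.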
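
This discontinuity is easy to observe numerically. The following sketch assumes NumPy and uses its library routine `numpy.linalg.pinv`; it measures the distance (in the Frobenius norm) between the pseudoinverse of $\mathrm{diag}(1, t)$ and the pseudoinverse of $\mathrm{diag}(1, 0)$ for shrinking values of $t$.

```python
import numpy as np

def A(t):
    """The diagonal matrix diag(1, t)."""
    return np.array([[1.0, 0.0], [0.0, t]])

A0_plus = np.linalg.pinv(A(0.0))

# Distance between the pseudoinverse at t and the pseudoinverse at 0.
ts = (1e-1, 1e-2, 1e-3)
distances = [np.linalg.norm(np.linalg.pinv(A(t)) - A0_plus) for t in ts]
```

The distances grow like $1/t$ rather than shrinking, which is the numerical face of the statement that the pseudoinverse does not depend continuously on the matrix.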


Thus we see that the Moore-Penrose pseudoinverse is not computationally stable.
This means that one has to be very careful in actual applications. Because of
the importance and utility of Moore-Penrose pseudoinverses, there exists a considerable
literature on techniques for computing $A^+$ or $A^+A$, given a matrix $A$. One of
the methods used in practice for computing the Moore-Penrose pseudoinverse over
$\mathbb{R}$ is a recursive one, known as Greville's method, which is based on the following
result: If $A \in M_{k\times n}(\mathbb{R})$, and if we write $A = [B \;\; v]$, where $B \in M_{k\times(n-1)}(\mathbb{R})$, then
\[ A^+ = \begin{bmatrix} B^+(I - v \wedge w) \\ w^T \end{bmatrix}, \]
where
\[ w = \begin{cases} \|(I - BB^+)v\|^{-2}(I - BB^+)v & \text{if } \|(I - BB^+)v\| \neq 0, \\ (1 + \|B^+v\|^2)^{-1}(B^+)^TB^+v & \text{otherwise.} \end{cases} \]
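
Because the recursion uses only rational operations, it can be carried out exactly. The sketch below is one possible rendering, assuming the input is a list of rows with integer or rational entries and reading $v \wedge w$ as the matrix $vw^T$; the function name `greville_pinv` and the variable names are illustrative. It builds the pseudoinverse one column at a time, starting from the pseudoinverse of the first column.

```python
from fractions import Fraction as Fr

def greville_pinv(A):
    """Exact pseudoinverse of a rational k x n matrix, built column by column."""
    k, n = len(A), len(A[0])
    cols = [[Fr(A[i][j]) for i in range(k)] for j in range(n)]

    # Pseudoinverse of a single column v: v^T / <v, v>, or the zero row if v = 0.
    v = cols[0]
    vv = sum(x * x for x in v)
    P = [[x / vv if vv else Fr(0) for x in v]]

    for v in cols[1:]:
        m = len(P)                       # number of columns already processed
        d = [sum(p * x for p, x in zip(row, v)) for row in P]            # B^+ v
        Bd = [sum(cols[j][i] * d[j] for j in range(m)) for i in range(k)]  # B B^+ v
        c = [x - y for x, y in zip(v, Bd)]                                # (I - B B^+) v
        cc = sum(x * x for x in c)
        if cc != 0:
            w = [x / cc for x in c]
        else:
            scale = 1 + sum(x * x for x in d)
            # (B^+)^T B^+ v, divided by 1 + ||B^+ v||^2
            w = [sum(P[j][i] * d[j] for j in range(m)) / scale for i in range(k)]
        # B^+ (I - v w^T) = B^+ - (B^+ v) w^T, followed by the new last row w^T.
        P = [[P[j][i] - d[j] * w[i] for i in range(k)] for j in range(m)] + [w]
    return P

P_full = greville_pinv([[1, 2], [-1, 3], [2, 4]])
P_sing = greville_pinv([[1, -1, 2], [2, 1, -2], [3, 0, 0]])
```

The second call exercises the "otherwise" branch, since the third column of that matrix is a multiple of the second. In floating-point arithmetic the test `cc != 0` would have to be replaced by a tolerance, which is one of the places where the instability discussed above enters.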




In the 1970s, the American mathematician 
Thomas N.E. Greville and the American/Israeli 
mathematician Adi Ben-Israel popularized and 
reinvigorated the use of the Moore-Penrose pseu- 
doinverse as a computational tool. 


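Another technique is to break $A$ up into blocks, if possible. Indeed, if
$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$, where $A_{11}$ is a nonsingular square matrix the rank of which equals
the rank of $A$, then, by Zlobec's formula, we have
\[ A^+ = \begin{bmatrix} A_{11} & A_{12} \end{bmatrix}^* B^* \begin{bmatrix} A_{11} \\ A_{21} \end{bmatrix}^*, \qquad \text{where } B = \left( \begin{bmatrix} A_{11} & A_{12} \end{bmatrix} A^* \begin{bmatrix} A_{11} \\ A_{21} \end{bmatrix} \right)^{-1}. \]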


One can also use convergence methods to compute the Moore-Penrose pseudoinverse
of a matrix. If $A \in M_{k\times n}(\mathbb{C})$ then, by Proposition 17.4, we know that
the eigenvalues of $A^HA$ are real. Let $c$ be the largest such eigenvalue and pick a
real number $b$ satisfying $0 < bc < 2$. For each integer $p \geq 2$, define the sequence
$Y_0, Y_1, \ldots$ of matrices in $M_{n\times k}(\mathbb{C})$ as follows:

(1) $Y_0 = bA^H$;
(2) If $h \geq 0$ and $Y_h$ has already been defined, set $T_h = I - Y_hA$ and set
$Y_{h+1} = Y_h + \sum_{i=1}^{p-1} T_h^i Y_h$. Then the sequence $Y_0, Y_1, \ldots$ converges to $A^+$.
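
A minimal sketch of such an iteration follows, assuming NumPy, a nonzero input matrix, the update $Y \mapsto Y + \sum_{i=1}^{p-1} T^iY$ with $T = I - YA$, and the particular choice $b = 1/c$. The function name `hyperpower_pinv` and the fixed number of steps are assumptions of this illustration; a practical routine would monitor convergence instead of iterating a fixed number of times.

```python
import numpy as np

def hyperpower_pinv(A, p=2, steps=30):
    """Iterative approximation of A^+ for a nonzero matrix A."""
    AH = A.conj().T
    c = np.linalg.eigvalsh(AH @ A).max()   # largest eigenvalue of A^H A
    b = 1.0 / c                            # satisfies 0 < b*c < 2
    Y = b * AH
    I = np.eye(A.shape[1])
    for _ in range(steps):
        T = I - Y @ A
        term, total = Y, Y
        for _ in range(1, p):              # adds T Y, T^2 Y, ..., T^(p-1) Y
            term = T @ term
            total = total + term
        Y = total
    return Y

A = np.array([[1.0, 2.0], [-1.0, 3.0], [2.0, 4.0]])
Y2 = hyperpower_pinv(A, p=2)
Y3 = hyperpower_pinv(A, p=3)
```

The example uses a small, well-conditioned matrix with linearly independent columns. For rank-deficient or badly conditioned inputs, rounding errors can accumulate over many steps, so this sketch should not be read as a robust general-purpose routine.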

Another method is the following: if $A \in M_{n\times n}(\mathbb{R})$ is an arbitrary symmetric
matrix we can define matrices $A_0, A_1, \ldots$ by setting $A_0 = A$ and
$A_{k+1} = [I + (I - A_k)(I + A_k)^{-1}]A_k$ for all $k \geq 0$. Also, we can define real numbers $c_0, c_1, \ldots$ by
setting $c_0 = 1$ and $c_{i+1} = 2c_i + 1$ for each $i \geq 0$. Then the Kovarik algorithm states
that if none of the numbers $-c_i^{-1}$ is an eigenvalue of $A$, the sequence $A_0, A_1, \ldots$
converges to $A^+A$.

Let $F$ be either $\mathbb{R}$ or $\mathbb{C}$, let $k$ and $n$ be positive integers, and let $A \in M_{k\times n}(F)$.
We now look at what the matrix $A^+$ says about a solution (if any) to a system of
linear equations of the form $AX = w$, where $w \in F^k$. First of all, we note that in
general the following proposition holds.


Proposition 19.4 Let $F$ be either $\mathbb{R}$ or $\mathbb{C}$, let $k$ and $n$ be positive integers, let
$A \in M_{k\times n}(F)$, and let $w \in F^k$. The system of linear equations $AX = w$ has
a solution if and only if $(AA^+)w = w$.


Proof If there is a vector $v \in F^n$ satisfying $Av = w$ then $(AA^+)w = (AA^+)(Av) =
(AA^+A)v = Av = w$. Conversely, if $(AA^+)w = w$ then $A(A^+w) = w$ and so
$Av = w$, where $v = A^+w$.
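
Proposition 19.4 can be used as a numerical solvability test. The sketch below assumes NumPy; the function name `has_solution`, the tolerance `tol`, and the small test matrix are assumptions made for this illustration. The second positional argument of `numpy.linalg.pinv` is the relative cutoff below which singular values are treated as zero.

```python
import numpy as np

def has_solution(A, w, tol=1e-9):
    """Test whether AX = w is solvable by checking (A A^+) w = w."""
    A_plus = np.linalg.pinv(A, 1e-10)
    return bool(np.allclose(A @ (A_plus @ w), w, atol=tol))

# Hypothetical 3 x 2 system whose first two equations have the same left-hand side.
A = np.array([[1.0, 1.0], [1.0, 1.0], [0.0, 1.0]])

consistent = has_solution(A, np.array([2.0, 2.0, 1.0]))     # w = A @ [1, 1]
inconsistent = has_solution(A, np.array([2.0, 3.0, 1.0]))   # equations 1 and 2 clash
```

In floating-point arithmetic the equality $(AA^+)w = w$ can only be tested up to a tolerance, so the answer for nearly inconsistent systems depends on the chosen `tol`.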


We also note that, in the situation above, if $y \in F^n$ then
\[ A(I - A^+A)y = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix} \]
and so we also see that $A^+w + (I - A^+A)y$ is also a solution to the system $AX = w$,
assuming that the system has any solutions at all. Conversely, any solution to this
system is of the form $A^+w + u$, where $Au = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$, and so $(I - A^+A)u = u$.


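Proposition 19.5 Let $F$ be either $\mathbb{R}$ or $\mathbb{C}$, let $k$ and $n$ be positive integers,
and let $A \in M_{k\times n}(F)$. Let $w \in F^k$. If the system $AX = w$ has a solution then
in the set of all solutions to this system of linear equations there is precisely
one having a minimal norm, and it is $A^+w$.


Proof If $u$ is a solution to this system, then we have already seen that it is of the
form $A^+w + (I - A^+A)y$. But we note that
\[ \begin{aligned} \langle A^+w, (I - A^+A)y \rangle &= \langle A^+AA^+w, (I - A^+A)y \rangle \\ &= \langle A^+w, (A^+A)(I - A^+A)y \rangle \\ &= \langle A^+w, (A^+A - A^+AA^+A)y \rangle \\ &= 0 \end{aligned} \]
so
\[ \begin{aligned} \|u\|^2 = \langle u, u \rangle &= \langle A^+w + (I - A^+A)y, A^+w + (I - A^+A)y \rangle \\ &= \langle A^+w, A^+w \rangle + \langle (I - A^+A)y, (I - A^+A)y \rangle \\ &= \|A^+w\|^2 + \|(I - A^+A)y\|^2, \end{aligned} \]
which implies that $\|u\| \geq \|A^+w\|$, with equality precisely when $(I - A^+A)y$ is the
zero vector, that is, when $u = A^+w$.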
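
The orthogonal decomposition used in the proof can be seen at work on a small underdetermined system. The sketch assumes NumPy; the matrix, the right-hand side, and the random seed are hypothetical choices made for illustration.

```python
import numpy as np

# Hypothetical system with two equations in three unknowns.
A = np.array([[1.0, 1.0, 1.0], [0.0, 1.0, 2.0]])
w = np.array([2.0, 1.0])

A_plus = np.linalg.pinv(A)
x_min = A_plus @ w                 # the candidate solution of minimal norm
N = np.eye(3) - A_plus @ A         # sends every y to a solution of AX = 0

# Other solutions of AX = w, obtained by adding vectors of the form N y.
rng = np.random.default_rng(0)
others = [x_min + N @ rng.normal(size=3) for _ in range(5)]
norms = [np.linalg.norm(u) for u in others]
```

Every vector in `others` solves the system, and none of them has norm smaller than that of `x_min`, because the added component `N @ y` is orthogonal to `x_min`.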
Example Let A = Bl Tl andl. | maaa |e | aus 
1 1 1 2 14 
1 4 
8 
the solution to the system AX = w having minimal norm is AT w = 4 11 |. Its 
9 


norm is hi / 266. 


But what happens if the system $AX = w$ does not have a solution? Suppose that
$F$ is either $\mathbb{R}$ or $\mathbb{C}$ and that $A \in M_{k\times n}(F)$ and $w \in F^k$, where $k$ and $n$ are positive
integers. Then the system $(A^+A)X = A^+w$ always has a solution, namely $A^+w$,
and, by Proposition 19.5, this is in fact the solution of minimal norm of this equation,
which is the best approximation to a solution of $AX = w$.


Example Consider the system of linear equations AX = w, where 


2 -4 5 1 
6 0 3 3 
A=], _4 5| and w=] ]}] 
6 0 3 3 
Then 
, [72 6 -2 6 
aoa, 5 3 -5 3 and Atw=-] 1 
40 40 


In order to emphasize the use of Proposition 19.5, we briefly consider the least
squares method, which is an important tool in many areas of applied mathematics
and statistics. This method was developed at the beginning of the nineteenth
century by Gauss and Legendre and, independently, by the American mathematical
pioneer Robert Adrain. Suppose that we have before us the results of several
observations, which, depending on values $t_1, \ldots, t_n$ of a real parameter, give us real
values $c_1, \ldots, c_n$. Our theory tells us that the set of points $\{(t_i, c_i) \mid 1 \leq i \leq n\}$ in
the Euclidean plane should lie on a straight line. However, because of measuring
and/or computational errors, this does not quite work out. So we want to find
the equation of the line in the plane which best fits our observed data. In other
words, we want to find a solution of minimal norm to the system of linear equations
$\{X_1 + t_iX_2 = c_i \mid 1 \leq i \leq n\}$, which can be written as
\[ \begin{bmatrix} 1 & t_1 \\ 1 & t_2 \\ \vdots & \vdots \\ 1 & t_n \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix}. \]
As we have seen, the solution of minimal norm, if it exists, is
\[ \begin{bmatrix} 1 & t_1 \\ 1 & t_2 \\ \vdots & \vdots \\ 1 & t_n \end{bmatrix}^+ \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix}. \]
Otherwise, this is the best approximation to a solution of the system.




Irish-born Robert Adrain emigrated to the United States in 1798. He 
published his own mathematics journal, but his work received no in- 
ternational attention at the time. 


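Example To find the equation of the line in the Euclidean plane which best fits the
set of points $\{(1, 3), (2, 7), (3, 8), (4, 11)\}$, we calculate
\[ \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \\ 1 & 4 \end{bmatrix}^+ \begin{bmatrix} 3 \\ 7 \\ 8 \\ 11 \end{bmatrix} = \frac{1}{20}\begin{bmatrix} 20 & 10 & 0 & -10 \\ -6 & -2 & 2 & 6 \end{bmatrix} \begin{bmatrix} 3 \\ 7 \\ 8 \\ 11 \end{bmatrix} = \begin{bmatrix} 1 \\ \frac{5}{2} \end{bmatrix}, \]
so the line we want is given by $\{(t, 1 + \frac{5}{2}t) \mid t \in \mathbb{R}\}$.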

We can use the same method to find the best fit of any polynomial of a higher
degree to a set of points. For example, if we wish to find a parabola which best fits
the set of points $\{(t_i, c_i) \mid 1 \leq i \leq n\}$ in the Euclidean plane, we have to find a best
approximation to a solution of the system of linear equations
$\{X_1 + t_iX_2 + t_i^2X_3 = c_i \mid 1 \leq i \leq n\}$, which we know is
\[ \begin{bmatrix} 1 & t_1 & t_1^2 \\ 1 & t_2 & t_2^2 \\ \vdots & \vdots & \vdots \\ 1 & t_n & t_n^2 \end{bmatrix}^+ \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix}. \]


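Example To find the equation of the parabola in the Euclidean plane which best fits
the set of points $\{(1, 3), (2, 7), (3, 8), (4, 11)\}$, we calculate
\[ \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \\ 1 & 4 & 16 \end{bmatrix}^+ \begin{bmatrix} 3 \\ 7 \\ 8 \\ 11 \end{bmatrix} = \frac{1}{20}\begin{bmatrix} 45 & -15 & -25 & 15 \\ -31 & 23 & 27 & -19 \\ 5 & -5 & -5 & 5 \end{bmatrix} \begin{bmatrix} 3 \\ 7 \\ 8 \\ 11 \end{bmatrix} = \frac{1}{4}\begin{bmatrix} -1 \\ 15 \\ -1 \end{bmatrix}, \]
and so the parabola we want is given by $\{(t, -\frac{1}{4}t^2 + \frac{15}{4}t - \frac{1}{4}) \mid t \in \mathbb{R}\}$.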
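
Both fits follow the same pattern: build the matrix whose rows are $[1, t_i, t_i^2, \ldots]$ and apply its pseudoinverse to the vector of observed values. A minimal sketch, assuming NumPy; the function name `polynomial_fit` is illustrative.

```python
import numpy as np

def polynomial_fit(points, degree):
    """Least-squares coefficients (constant term first) of a polynomial fit."""
    t = np.array([p[0] for p in points], dtype=float)
    c = np.array([p[1] for p in points], dtype=float)
    # Rows of M are [1, t_i, t_i^2, ..., t_i^degree].
    M = np.vander(t, degree + 1, increasing=True)
    return np.linalg.pinv(M) @ c

points = [(1, 3), (2, 7), (3, 8), (4, 11)]
line = polynomial_fit(points, 1)        # coefficients of 1 and t
parabola = polynomial_fit(points, 2)    # coefficients of 1, t and t^2
```

For these four points the computed coefficients are approximately $(1, 2.5)$ for the line and $(-0.25, 3.75, -0.25)$ for the parabola. Library routines for least squares usually avoid forming the pseudoinverse explicitly, but the explicit form mirrors the calculation in the text.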


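Needless to say, we can also consider a much more general context. Suppose that
$W$ is a finitely-generated subspace of $\mathbb{R}^A$. Given a set of observations
$\{(t_i, c_i) \mid 1 \leq i \leq n\} \subseteq A \times \mathbb{R}$, we want to find the function $g \in W$ which best approximates these
observations.

To do this, we pick a basis $\{f_1, \ldots, f_k\}$ for $W$. Then we want to find a best
approximation to a solution of the system of linear equations
\[ \{X_1f_1(t_i) + \cdots + X_kf_k(t_i) = c_i \mid 1 \leq i \leq n\}, \]
which can be written as
\[ \begin{bmatrix} f_1(t_1) & \cdots & f_k(t_1) \\ \vdots & & \vdots \\ f_1(t_n) & \cdots & f_k(t_n) \end{bmatrix} \begin{bmatrix} X_1 \\ \vdots \\ X_k \end{bmatrix} = \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix}. \]
As we have seen, this is
\[ \begin{bmatrix} f_1(t_1) & \cdots & f_k(t_1) \\ \vdots & & \vdots \\ f_1(t_n) & \cdots & f_k(t_n) \end{bmatrix}^+ \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix}. \]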

Least-squares approximations are often used to find best-fit solutions to very 
large systems of linear equations of the form AX = w which, in theory, have an 
exact solution but in practice that solution cannot be found because of errors in 
measurement of the data and computational errors. Indeed, Gauss developed this 
method for finding solutions to the very large systems of linear equations which 
resulted from laying down a triangulation grid for a geodetic survey of the state of 
Hanover he conducted in 1818. In 1978, the American National Geodetic Survey 
used it to solve a system of over 2.5 million linear equations in 400,000 unknowns 
which resulted from the updating of the triangulation grid for the continental United 
States. 

The constructions presented in this chapter can be generalized considerably.
Indeed, if $(K, \bullet)$ is an associative unital algebra over a field $F$ on which we have
defined an involution $a \mapsto a^*$, then an element $a$ of $K$ has a Moore-Penrose
pseudoinverse $b$ if and only if the following conditions are satisfied:

(1) $a \bullet b \bullet a = a$ and $b \bullet a \bullet b = b$;
(2) $(b \bullet a)^* = b \bullet a$ and $(a \bullet b)^* = a \bullet b$.

Proposition 19.1 can easily be modified to show that such a pseudoinverse, if it
exists, is unique. Pseudoinverses of this sort show up in the study of $C^*$-algebras, or,
more generally, associative unital algebras $(K, \bullet)$ that satisfy the Gelfand-Naimark
property, namely that $e + a^* \bullet a$ is a unit of $K$ for each $a \in K$, where $e$ is the
multiplicative identity of $K$. In such algebras, it is possible to show that if $a \in K$
satisfies the condition that there exists an element $b \in K$ satisfying $a \bullet b \bullet a = a$ and
$b \bullet a \bullet b = b$, then $a$ has a Moore-Penrose pseudoinverse.
Israil Moisseevich Gelfand was a twentieth- 
century Russian mathematician who emigrated 
to the United States. He worked in many ar- 
eas of analysis and mathematical biology. Mark 
Aronovich Naimark was a twentieth-century 
Ukrainian mathematician who worked primarily in 
functional analysis. 


Finally, one should note that the Moore—Penrose pseudoinverse is just one of 
many “pseudoinverses” in the mathematical literature, each designed for a fairly 
specific purpose. The first of these was introduced by Fredholm in 1903 to deal with 
integral operators. Others are based on specific situations which arise in algebra or 
analysis, or which are used to implement specific computational methods. 


Example Let $V$ be a vector space finitely generated over a field $F$ and let
$\alpha \in \mathrm{End}(V)$. Let $k = \inf\{0 < h \in \mathbb{N} \mid \mathrm{rk}(\alpha^h) = \mathrm{rk}(\alpha^{h+1})\}$. Then the Drazin
pseudoinverse of $\alpha$ is the endomorphism $\beta$ of $V$ satisfying $\alpha^{k+1}\beta = \alpha^k$, $\beta\alpha\beta = \beta$,
and $\alpha\beta = \beta\alpha$. If such a $\beta$ exists, it is necessarily unique. It is immediate that if
$\alpha \in \mathrm{Aut}(V)$ then $k = 1$ and $\beta = \alpha^{-1}$. If $\alpha$ is nilpotent then its Drazin pseudoinverse
is $\sigma_0$. Drazin pseudoinverses have important applications in differential equations
and in mathematical economics.


Exercises 


Exercise 1160 


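Let $V$ and $W$ be finitely-generated inner product spaces and let $\alpha \in \mathrm{Hom}(V, W)$.
Let $\beta \in \mathrm{Hom}(W, V)$ satisfy $\alpha\beta\alpha = \alpha$. Show that $\mathrm{rk}(\beta) \geq \mathrm{rk}(\alpha)$, with equality
holding if and only if $\beta\alpha\beta = \beta$.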


Exercise 1161 


1 
Let A = | —1 € M3x2(R). Calculate AT. 
2 


Ww” = = 


Exercise 1162 
Let $A = [5 \;\; 0 \;\; 0] \in M_{1\times 3}(\mathbb{Q})$. Calculate $A^+$.


Exercise 1163 
Let $A = [a_1 \;\ldots\; a_n] \in M_{1\times n}(\mathbb{C})$, where $n$ is a positive integer. Show that
$A^+ = (AA^H)^{-1}A^H$.


Exercise 1164


Let $A = \begin{bmatrix} 2 & 2 & 0 \\ 1 & 2 & 1 \\ 1 & 2 & 1 \end{bmatrix} \in M_{3\times 3}(\mathbb{R})$. Calculate $A^+$.
1 2 1 


Exercise 1165 
Let $V$ and $W$ be finitely-generated inner product spaces, and let $\alpha \in \mathrm{Hom}(V, W)$.
For any nonzero scalar $c$, show that $(c\alpha)^+ = \frac{1}{c}\alpha^+$.


Exercise 1166 
Let $n$ be a positive integer and let $A \in M_{n\times n}(\mathbb{R})$ be a diagonal matrix.
Calculate $A^+$.


Exercise 1167 
Let $V$ and $W$ be finitely-generated inner product spaces, and let $\alpha \in \mathrm{Hom}(V, W)$.
Show that $(\alpha^*)^+ = (\alpha^+)^*$.


Exercise 1168 
Let V = R?, which is endowed with the dot product and let œ : V > R be the 
a 


linear functional defined by a : | b 


| > a. Let $ : RV be the linear transfor- 


mation defined by £ : ate E Show that (aB)t 4 Tat. 


Exercise 1169 
Let $n$ be a positive integer and let $A = [a_{ij}] \in M_{n\times n}(\mathbb{R})$ be the matrix all entries
of which are equal to 1. Show that $A^+ = n^{-2}A$.


Exercise 1170 


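Let $A \in M_{k\times n}(\mathbb{R})$ be a matrix of the form $\begin{bmatrix} C & O \\ O & O \end{bmatrix}$, where $C$ is a $t \times t$ nonsingular
diagonal matrix. Show that $A^+ = \begin{bmatrix} D & O \\ O & O \end{bmatrix}$, where $D = C^{-1}$.

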
Exercise 1171 

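Let $V = \mathbb{R}^n$ on which we have the dot product defined, and let $\alpha \in \mathrm{End}(V)$ satisfy
the condition that $\ker(\alpha) = \mathrm{im}(\alpha)^\perp$. Show that the restriction $\beta$ of $\alpha$ to $\mathrm{im}(\alpha)$ is
an automorphism of $\mathrm{im}(\alpha)$ and that the restriction of $\alpha^+$ to $\mathrm{im}(\alpha)$ equals $\beta^{-1}$.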


Exercise 1172 
Let $n$ be a positive integer and let $A, B \in M_{n\times n}(\mathbb{R})$ be matrices satisfying the
conditions $ABA = A$, $BAB = B$, and $A^2 = A$. Is it necessarily true that $B^2 = B$?


Exercise 1173

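Let $h$, $k$, $m$, and $n$ be positive integers, let $A \in M_{h\times k}(\mathbb{R})$, let $B \in M_{m\times n}(\mathbb{R})$,
and let $C \in M_{h\times n}(\mathbb{R})$. Show that there exists a matrix $X \in M_{k\times m}(\mathbb{R})$ satisfying
$AXB = C$ if and only if $AA^+CB^+B = C$.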


Exercise 1174 
Let $k$ and $n$ be positive integers and let $B \in M_{k\times k}(\mathbb{R})$ and $C \in M_{n\times n}(\mathbb{R})$ be
orthogonal matrices. For $A \in M_{k\times n}(\mathbb{R})$, show that $(BAC)^+ = C^TA^+B^T$.


Exercise 1175 
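Let $k$ and $n$ be positive integers and let $A \in M_{k\times k}(\mathbb{R})$ and $B \in M_{k\times n}(\mathbb{R})$. Let
$C \in M_{n\times n}(\mathbb{R})$ be nonsingular. Prove that
\[ \begin{bmatrix} A & AB \\ O & C \end{bmatrix}^+ = \begin{bmatrix} A^+ & -A^+ABC^{-1} \\ O & C^{-1} \end{bmatrix}. \]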


20 Bilinear Transformations and Forms


Let $V$, $W$, and $Y$ be vector spaces over a field $F$. We say that a function
$f : V \times W \to Y$ is a bilinear transformation if and only if the function $v \mapsto f(v, w_0)$
belongs to $\mathrm{Hom}(V, Y)$ for any given vector $w_0 \in W$ and the function $w \mapsto f(v_0, w)$
belongs to $\mathrm{Hom}(W, Y)$ for any given vector $v_0 \in V$. The set of all bilinear
transformations from $V \times W$ to $Y$ will be denoted by $\mathrm{Bil}(V \times W, Y)$. If $f, g \in \mathrm{Bil}(V \times W, Y)$
and if $c \in F$ then $f + g$ and $cf$ also belong to $\mathrm{Bil}(V \times W, Y)$, and so
$\mathrm{Bil}(V \times W, Y)$ is a subspace of the vector space $Y^{V\times W}$ over $F$. Also, any bilinear
transformation $f : V \times W \to Y$ defines a bilinear transformation $f^{op} : W \times V \to Y$,
called the opposite transformation of $f$, by setting $f^{op} : (w, v) \mapsto f(v, w)$. It is
clear that the function
\[ (\;)^{op} : \mathrm{Bil}(V \times W, Y) \to \mathrm{Bil}(W \times V, Y) \]
is an isomorphism of vector spaces. We say that a bilinear transformation
$f \in \mathrm{Bil}(V \times V, Y)$ is symmetric if and only if $f = f^{op}$. It is skew symmetric if
and only if $f = -f^{op}$.

In particular, if we consider a single vector space $V$ over a field $F$, then we note
that $f \in \mathrm{Bil}(V \times V, V)$ if and only if the operation $\bullet$ on $V$ defined by $v \bullet w =
f(v, w)$ turns $V$ into an $F$-algebra. This algebra is commutative if and only if $f$ is
symmetric.


Example Let $n$ be a positive integer and let $V = \mathbb{R}^n$, on which we have the dot product
defined. A classical problem in geometry is to ask if there exists a bilinear
transformation $f \in \mathrm{Bil}(V \times V, V)$ satisfying the condition that $\|f(v, w)\| = \|v\| \cdot \|w\|$
for all $v, w \in V$. Euler showed that such a transformation exists for the case $n = 4$.
At the end of the nineteenth century, Hurwitz showed that such transformations exist
only when $n = 1$, 2, 4, or 8.


J.S. Golan, The Linear Algebra a Beginning Graduate Student Ought to Know, 453 
DOI 10.1007/978-94-007-2636-9_20, © Springer Science+Business Media B.V. 2012 


Adolph Hurwitz was a nineteenth-century German mathematician 
who taught both Hilbert and Minkowski. 


Example Let $F$ be a field and let $k$, $n$, and $t$ be positive integers. Set $V = M_{k\times n}(F)$,
$W = M_{t\times n}(F)$, and $Y = M_{k\times t}(F)$. Then there exists a bilinear transformation
$V \times W \to Y$ defined by $(A, B) \mapsto AB^T$. In particular, we have a bilinear
transformation $F^n \times F^n \to M_{n\times n}(F)$ given by $(v, w) \mapsto v \wedge w$. More generally, if $V$, $W$,
and $Y$ are as mentioned, every matrix $C \in M_{n\times n}(F)$ defines a bilinear transformation
$V \times W \to Y$ by setting $(A, B) \mapsto ACB^T$.


Example For vector spaces $V$ and $W$ over a field $F$, the function $\mathrm{Hom}(V, W) \times
V \to W$ given by $(\alpha, v) \mapsto \alpha(v)$ is a bilinear transformation.


Let $V$, $W$, and $Y$ be vector spaces over a field $F$. The image of a bilinear
transformation $f \in \mathrm{Bil}(V \times W, Y)$ is not necessarily a subspace of $Y$, as the following
example shows.


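Example Consider the bilinear transformation $f : \mathbb{R}^2 \times \mathbb{R}^2 \to M_{2\times 2}(\mathbb{R})$ defined
by $f : (v, w) \mapsto v \wedge w$. The image of $f$ contains $\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$ and $\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}$, but is not a
subspace since $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \notin \mathrm{im}(f)$.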


As with linear transformations, bilinear transformations are totally determined
by their behavior on bases. That is to say, let $V$ and $W$ be vector spaces over a
field $F$, and let $B = \{v_i \mid i \in \Omega\}$ and $D = \{w_j \mid j \in \Lambda\}$ be bases of $V$ and $W$,
respectively. Let $Y$ be a vector space over $F$ and let $f_0 : B \times D \to Y$ be a
function. Then there exists a unique bilinear transformation $f \in \mathrm{Bil}(V \times W, Y)$
satisfying $f(v_i, w_j) = f_0(v_i, w_j)$ for all $i$ and $j$, namely the function defined
by
\[ f : \Bigl( \sum_{i\in\Omega} a_iv_i, \sum_{j\in\Lambda} b_jw_j \Bigr) \mapsto \sum_{i\in\Omega} \sum_{j\in\Lambda} a_ib_jf_0(v_i, w_j). \]
In the case that $V = W = Y$, we have already noted this fact in Proposition 5.5.


Proposition 20.1 Jf V, W, and Y are vector spaces over a field F, then 
Bill(V x W, Y) is isomorphic to Hom(V, Hom(W, Y)). 


Proof Define a function θ : Bill(V × W, Y) → Hom(V, Hom(W, Y)) as follows:
given a bilinear transformation f ∈ Bill(V × W, Y) and a vector v ∈ V, set
θ(f)(v) : w ↦ f(v, w). It is straightforward to check that indeed θ(f)(v) ∈
Hom(W, Y) for all f ∈ Bill(V × W, Y) and all v ∈ V. Moreover, θ(f)(v_1 + v_2) =
θ(f)(v_1) + θ(f)(v_2) and θ(f)(cv) = cθ(f)(v) for all v, v_1, v_2 ∈ V and all c ∈ F,
so θ(f) ∈ Hom(V, Hom(W, Y)) for all f ∈ Bill(V × W, Y). Finally, θ(f + g) =
θ(f) + θ(g) and θ(cf) = cθ(f) for all f, g ∈ Bill(V × W, Y) and all c ∈ F, and so
we have shown that θ is a linear transformation.

It is also possible to define a function

φ : Hom(V, Hom(W, Y)) → Bill(V × W, Y)

by setting φ(α) : (v, w) ↦ α(v)(w) for all v ∈ V and w ∈ W, and again it is easy
to show that this is a linear transformation. If α ∈ Hom(V, Hom(W, Y)) and v ∈ V,
then θφ(α)(v) : w ↦ φ(α)(v, w) = α(v)(w) and so θφ(α)(v) = α(v) for all v ∈ V.
Thus θφ(α) = α for all α ∈ Hom(V, Hom(W, Y)), and so θφ is the identity function
on Hom(V, Hom(W, Y)). Conversely, if f ∈ Bill(V × W, Y) then

φθ(f) : (v, w) ↦ θ(f)(v)(w) = f(v, w)

for all v ∈ V and w ∈ W and so φθ(f) = f for all f ∈ Bill(V × W, Y), proving
that φθ is the identity function on Bill(V × W, Y). Thus we have established that θ
is an isomorphism, with θ⁻¹ = φ.
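
Readers with a programming background may recognize the two maps in this proof as currying and uncurrying. The following Python sketch illustrates the round trip on one sample; the function names `theta` and `phi` mirror the proof, and the particular form `f` is a hypothetical choice made only for the illustration.

```python
def theta(f):
    """Send a two-argument function f to v -> (w -> f(v, w))."""
    return lambda v: (lambda w: f(v, w))

def phi(alpha):
    """Send alpha: v -> (w -> y) back to a two-argument function."""
    return lambda v, w: alpha(v)(w)

def f(v, w):
    """An illustrative bilinear form on R^2 x R^2 (hypothetical choice)."""
    return v[0] * w[0] + 2 * v[0] * w[1] - v[1] * w[1]

v, w = (1, 3), (4, -2)
round_trip = phi(theta(f))   # should agree with f on every pair
```

The sketch checks only the set-theoretic bijection on one pair of vectors; linearity of the two maps is the part that the proof itself supplies.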


Let V and W be vector spaces over a field F. A bilinear transformation
f : V × W → F is called a bilinear form. We will denote the set of all such bilinear
forms by Bill(V × W), instead of Bill(V × W, F). By what we have seen above,
Bill(V × W) is a subspace of F^{V×W} which is isomorphic to Hom(V, D(W)). A
bilinear form f ∈ Bill(V × W) is nondegenerate if and only if for each
0_V ≠ v ∈ V there exists a w ∈ W satisfying f(v, w) ≠ 0 and for each
0_W ≠ w ∈ W there exists a v ∈ V satisfying f(v, w) ≠ 0.



Mathematicians at the beginning of the nineteenth century, such as 
Gauss and Jacobi, preferred to state their results in terms of bilinear 
forms rather than in terms of matrices. Sylvester contributed greatly to 
the theory of bilinear forms, as did the influential nineteenth-century 
German mathematician Karl Weierstrass. 


Example If V is an inner product space over R, then the function (v, w) ↦ ⟨v, w⟩
belongs to Bill(V × V). This is not true, of course, if our field of scalars is C.


Example If F is a field and V = Fⁿ for some positive integer n, then the function
(v, w) ↦ v ⊙ w belongs to Bill(Fⁿ × Fⁿ). This function is particularly useful in
the case F = GF(2). Indeed, if v ∈ GF(2)ⁿ, then
\[ v \odot v = \begin{cases} 0 & \text{if an even number of entries in } v \text{ are equal to } 1, \\ 1 & \text{if an odd number of entries in } v \text{ are equal to } 1. \end{cases} \]
This value is known as the parity of v.
More generally, let A be a finite set and let V be the collection of all subsets of A.
Define a function f : V × V → GF(2) by setting
\[ f : (A, B) \mapsto \begin{cases} 0 & \text{if } A \cap B \text{ has an even number of elements,} \\ 1 & \text{if } A \cap B \text{ has an odd number of elements.} \end{cases} \]
Then f ∈ Bill(V × V).
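
The bilinearity of the last form can be checked by brute force on a small ground set. The sketch below assumes that the vector-space structure on the collection of subsets is the usual one over GF(2), with symmetric difference as addition; this assumption, the ground set {1, 2, 3}, and the name `parity_form` are all illustrative.

```python
from itertools import combinations

def parity_form(a, b):
    """f(A, B) = |A intersect B| mod 2, for subsets given as frozensets."""
    return len(a & b) % 2

ground = (1, 2, 3)
subsets = [frozenset(c) for r in range(len(ground) + 1)
           for c in combinations(ground, r)]

# Addition of subsets is assumed to be symmetric difference (the ^ operator).
additive = all(
    parity_form(a ^ b, c) == (parity_form(a, c) + parity_form(b, c)) % 2
    for a in subsets for b in subsets for c in subsets
)
```

Because the form is symmetric, additivity in the first argument gives additivity in the second as well, and scalar multiplication over GF(2) is trivial to verify.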


Example If V is a vector space over a field F, we have a nondegenerate bilinear
form in Bill(D(V) × V) given by (δ, v) ↦ δ(v). Similarly, if δ_1, δ_2 ∈ D(V), we
have a bilinear form in Bill(V × V) given by (v, w) ↦ δ_1(v)δ_2(w), which is not
the 0-function if δ_1 and δ_2 are not the 0-functional.


Example If F is a field and if k and n are positive integers, then each matrix
A ∈ M_{k×n}(F) defines a bilinear form in Bill(F^k × Fⁿ) by (v, w) ↦ v ⊙ Aw.


Example If F is a field of characteristic not equal to 2 and if V is a vector space
over F, then any f ∈ Bill(V × V) can be written as a sum of a symmetric bilinear
form and a skew-symmetric bilinear form, namely f = f_1 + f_2, where
f_1 : (v, w) ↦ ½[f(v, w) + f(w, v)] and f_2 : (v, w) ↦ ½[f(v, w) − f(w, v)].
Moreover, this representation is unique, for if f = g_1 + g_2, where g_1 is symmetric
and g_2 is skew symmetric, then for each (v, w) ∈ V × V we have
f(v, w) + f(w, v) = 2g_1(v, w) and f(v, w) − f(w, v) = 2g_2(v, w), from which we
deduce that g_i = f_i for i = 1, 2.
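
In matrix terms, this decomposition splits a square matrix into its symmetric and skew-symmetric parts. A small Python sketch over the rationals follows; the function name `split` and the sample matrix are illustrative assumptions.

```python
from fractions import Fraction

def split(t):
    """Split a square matrix t into (symmetric, skew-symmetric) parts."""
    n = len(t)
    half = Fraction(1, 2)   # requires 2 to be invertible in the field
    sym = [[half * (t[i][j] + t[j][i]) for j in range(n)] for i in range(n)]
    skew = [[half * (t[i][j] - t[j][i]) for j in range(n)] for i in range(n)]
    return sym, skew

t = [[1, 4], [0, 3]]
sym, skew = split(t)
```

The factor `half` is exactly where the hypothesis on the characteristic enters: over a field of characteristic 2 no such element exists.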


If V and W are vector spaces over F of finite dimension k and n, respectively,
then any bilinear form on V × W can be represented as in the previous example.
Indeed, if we fix bases B = {v_1, ..., v_k} for V and D = {w_1, ..., w_n} for W, then for
any f ∈ Bill(V × W) we define the matrix T_{BD}(f) = [f(v_i, w_j)] ∈ M_{k×n}(F) and
check that if v = Σ_{i=1}^{k} a_i v_i and w = Σ_{j=1}^{n} b_j w_j, then
f(v, w) = [a_1 ⋯ a_k]^T ⊙ T_{BD}(f)[b_1 ⋯ b_n]^T.
Indeed, for fixed B and D, the function f ↦ T_{BD}(f) is an isomorphism from
Bill(V × W) to M_{k×n}(F).


Example Let F be a field and let V = F?. Consider the bases B = {fo ; 7 | | 


and D = fi] ‘ 1: || of V.If f € Bill(V x V) is given by f : (Hi ; [s] = 


(a + b)(c + d), then it is easy to verify that Tgg (f) = F | and Tpp(f) = 


[o 4} 
Proposition 20.2 Let V and W be vector spaces finitely generated over a
field F and having bases B = {v_1, ..., v_k} and D = {w_1, ..., w_n}, respectively.
Let C = {x_1, ..., x_k} and E = {y_1, ..., y_n} also be bases for V and W,
respectively, and let P = [p_{ir}] ∈ M_{k×k}(F) and Q = [q_{js}] ∈ M_{n×n}(F) be
nonsingular matrices satisfying
\[ \begin{bmatrix} x_1 \\ \vdots \\ x_k \end{bmatrix} = P \begin{bmatrix} v_1 \\ \vdots \\ v_k \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} = Q \begin{bmatrix} w_1 \\ \vdots \\ w_n \end{bmatrix}. \]
Then for f ∈ Bill(V × W) we see that T_{CE}(f) = P T_{BD}(f) Q^T.


Proof As a direct consequence of the definitions, we see that
\[ f(x_i, y_j) = f\Bigl(\sum_{r=1}^{k} p_{ir} v_r,\ \sum_{s=1}^{n} q_{js} w_s\Bigr) = \sum_{r=1}^{k}\sum_{s=1}^{n} p_{ir}\, f(v_r, w_s)\, q_{js}, \]
and this is precisely the (i, j)th entry of P T_{BD}(f) Q^T.
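
The change-of-basis formula can be checked numerically. The sketch below takes the form (a + b)(c + d) on Q² × Q² with canonical bases B and D; the matrices P and Q, and hence the new bases C and E, are hypothetical choices made only for this illustration.

```python
def f(v, w):
    """Illustrative bilinear form: ((a, b), (c, d)) -> (a + b)(c + d)."""
    return (v[0] + v[1]) * (w[0] + w[1])

def gram(form, left, right):
    """Matrix [form(left_i, right_j)] with respect to two bases."""
    return [[form(x, y) for y in right] for x in left]

def matmul(a, b):
    return [[sum(a[i][r] * b[r][j] for r in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

B = D = [(1, 0), (0, 1)]    # canonical bases
P = [[1, 1], [1, -1]]       # x_i = sum_r p_ir v_r
Q = [[2, 0], [1, 1]]        # y_j = sum_s q_js w_s
C = [(1, 1), (1, -1)]       # the vectors x_1, x_2 determined by P
E = [(2, 0), (1, 1)]        # the vectors y_1, y_2 determined by Q

lhs = gram(f, C, E)
rhs = matmul(matmul(P, gram(f, B, D)), transpose(Q))
```

Both sides are computed independently: `lhs` evaluates the form directly on the new bases, while `rhs` applies the matrix formula of the proposition.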


In particular, we see that if f ∈ Bill(V × V), where V is a vector space of
finite dimension n over a field F, and if B and D are bases of V, then there exists a
nonsingular matrix P ∈ M_{n×n}(F) satisfying T_{DD}(f) = P T_{BB}(f) P^T. In general,
matrices A and C in M_{n×n}(F) are congruent if and only if there exists a
nonsingular matrix P ∈ M_{n×n}(F) satisfying C = PAP^T. Congruence is easily checked
to be an equivalence relation on M_{n×n}(F), which joins the relations of equivalence
and similarity that we have already defined. Congruent matrices clearly have the
same rank, so that the rank of a matrix of the form T_{BB}(f) depends only on f and
not on the choice of basis B. Therefore, we call this the rank of the bilinear form f.
Thus, for example, the bilinear forms in Bill(V × V) of rank 1 are precisely those
of the form (v, w) ↦ α(v)β(w), where α, β ∈ D(V).

A matrix congruent to a symmetric matrix is again symmetric. Indeed, if
A ∈ M_{n×n}(F) is symmetric, then for any nonsingular matrix P we have
(PAP^T)^T = P^{TT} A^T P^T = PAP^T.


1 -6 -6 
Example The matrix A = | —6 40 39 | € M3,3(R) is congruent to 7, since 
39 39 
0 0 
PAP! =I, where P=| 3 5 0 
aes 


As was the case with inner products, we can define orthogonality with respect
to an arbitrary bilinear form. This concept has important applications when we are
working over fields other than R or C, and especially in areas such as algebraic
coding theory, where all of the work is done over finite fields. Let V be a vector
space over a field F and let f ∈ Bill(V × V). Vectors v, w ∈ V are f-orthogonal if
and only if f(v, w) = 0. In this case, we will write v ⊥_f w. (One has to be careful
here: it may be true that v ⊥_f w but false that w ⊥_f v; this will not happen, of
course, if f is symmetric.) If A is a nonempty subset of V, then we define the
right f-orthogonal complement of A to be the set A^{⊥_f} = {w ∈ V | v ⊥_f w for all
v ∈ A}. Complements of this form may behave very differently from complements
defined by inner products, as the following example shows.


Example Let F = GF(2) and let V = F*. Define f € Bill(V x V) by setting 


0 1 1 0 
f(v, w) =v © w. Then W = : A i ; A A a is a subspace of V 
0 1 (0) 1 


which satisfies Wtf = W. 
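
The phenomenon of a subspace equal to its own orthogonal complement is easy to reproduce by enumeration. The following Python sketch works in GF(2)⁴ with the dot product; the generators chosen for W are an assumption made for illustration and are not necessarily those of the example above.

```python
from itertools import product

def dot2(v, w):
    """Dot product over GF(2)."""
    return sum(a * b for a, b in zip(v, w)) % 2

def span2(gens, n):
    """All GF(2)-linear combinations of the given generators."""
    vecs = set()
    for coeffs in product((0, 1), repeat=len(gens)):
        vecs.add(tuple(sum(c * g[i] for c, g in zip(coeffs, gens)) % 2
                       for i in range(n)))
    return vecs

n = 4
V = set(product((0, 1), repeat=n))
W = span2([(1, 1, 0, 0), (0, 0, 1, 1)], n)   # hypothetical choice of W
W_perp = {w for w in V if all(dot2(v, w) == 0 for v in W)}
```

Here W has four elements, so dim(W) = 2, and the computed complement coincides with W itself, something that cannot happen for a nonzero subspace of a real inner product space.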

We note that V^{⊥_f} is trivial if and only if for any 0_V ≠ w ∈ V there exists a
vector v ∈ V satisfying f(v, w) ≠ 0. This condition is not a consequence of our
definitions, and we must explicitly state it when we need it. It holds, of course, if f
is nondegenerate.


1 1 
Example Let V=R i ; C R4. If f € Bill(V x V) is defined by 
1 —1 
a by 0 
f: i , s +> aibi +a2b2 +a3b3 — a4b4, then A € V+/ and, indeed, 
3 3 
a4 b4 1 
1 0 
—1 0 
tre 
v i O};’] 1 
0 1 


Proposition 20.3 Let V be a vector space over a field F, and let f ∈
Bill(V × V). If A is a nonempty subset of V then:

(1) A^{⊥_f} is a subspace of V;

(2) A^{⊥_f} = (FA)^{⊥_f};

(3) If A ⊆ B then B^{⊥_f} ⊆ A^{⊥_f}.

Moreover, if {A_i | i ∈ Ω} is a collection of nonempty subsets of V, then

\[ \Bigl(\bigcup_{i\in\Omega} A_i\Bigr)^{\perp_f} = \bigcap_{i\in\Omega} A_i^{\perp_f}. \]
Proof The proof of (1)-(3) is an immediate consequence of the definitions. To prove
the last statement, we note that if w ∈ V, then w ∈ (⋃_{i∈Ω} A_i)^{⊥_f} if and only if for
each i ∈ Ω and each v ∈ A_i we have f(v, w) = 0. This is true if and only if
w ∈ A_i^{⊥_f} for each i ∈ Ω, namely if and only if w ∈ ⋂_{i∈Ω} A_i^{⊥_f}.


Proposition 20.4 Let V be a vector space finitely generated over a field F
and let f ∈ Bill(V × V) satisfy the condition that V^{⊥_f} is trivial. Then each
subspace W of V satisfies the following conditions:

(1) If δ ∈ D(W) there exists a v ∈ V such that δ(w) = f(w, v) for all w ∈ W;
(2) dim(W) + dim(W^{⊥_f}) = dim(V).


Proof (1) Every vector v ∈ V defines a linear functional δ_v ∈ D(V) by setting
δ_v : y ↦ f(y, v). Moreover, the function v ↦ δ_v from V to D(V) is a linear
transformation, which is a monomorphism as a result of the condition that V^{⊥_f} is
trivial. But dim(V) = dim(D(V)) since V is finitely generated, and hence this is an
isomorphism. Now let δ ∈ D(W) and let Y be a complement of W in V. Then the
function from V to F given by w + y ↦ δ(w) belongs to D(V) and so there exists
a vector v ∈ V such that it equals δ_v. In particular, δ(w) = δ_v(w) = f(w, v) for
all w ∈ W, proving (1).

(2) The function from V to D(W) which assigns to each v ∈ V the restriction of
δ_v to W is a linear transformation which, by (1), is an epimorphism. The kernel of
this epimorphism consists of all vectors v ∈ V satisfying f(w, v) = 0 for all w ∈ W,
and that is precisely W^{⊥_f}. Therefore, by Proposition 6.10, we have (2).


In particular, we see from Proposition 20.4 that a necessary and sufficient
condition for us to have V = W ⊕ W^{⊥_f} is that W and W^{⊥_f} be disjoint.


Proposition 20.5 Let V and W be vector spaces finitely generated over a
field F and let f ∈ Bill(V × W) be a bilinear form which is not the 0-function.
Then there exist bases {v_1, ..., v_k} and {w_1, ..., w_n} of V and W, respectively,
and there exists an integer 1 ≤ t ≤ min{k, n} such that
\[ f(v_i, w_j) = \begin{cases} 1 & \text{if } i = j \le t, \\ 0 & \text{otherwise.} \end{cases} \]

Proof Since f is not the 0-function, there exist vectors v_1 ∈ V and y_1 ∈ W such that
f(v_1, y_1) ≠ 0. Therefore, if we set w_1 = f(v_1, y_1)⁻¹ y_1, we have f(v_1, w_1) = 1. Let
V_1 = Fv_1 and W_1 = Fw_1. If we set W_2 = {w ∈ W | f(v_1, w) = 0}, then W_1 ∩ W_2 =
{0_W} since cw_1 ∉ W_2 for all 0 ≠ c ∈ F. We claim that W = W_1 ⊕ W_2. Indeed, if
w ∈ W and if c = f(v_1, w) then we see that

f(v_1, w − cw_1) = f(v_1, w) − c f(v_1, w_1) = c − c = 0

and so w − cw_1 ∈ W_2, which proves the claim. In a similar way, we have
V = V_1 ⊕ V_2, where V_2 = {v ∈ V | f(v, w_1) = 0}. Thus we see that f(v, w) = 0
whenever (v, w) ∈ [V_1 × W_2] ∪ [V_2 × W_1].

By passing to the opposite form if necessary, we can assume without loss of
generality that k ≤ n. If k = 1, we choose {v_1} as a basis for V and {w_1, ..., w_n} as a
basis for W, where {w_2, ..., w_n} is an arbitrary basis for W_2. This proves the
proposition, with t = 1. Now assume that k > 1 (which implies n > 1) and that the
proposition has been proven whenever dim(V) < k. In particular, we will look at the
restriction of f to V_2 × W_2. If this restriction is the 0-function, then arbitrary bases
{v_2, ..., v_k} of V_2 and {w_2, ..., w_n} of W_2 will do, with t = 1. Otherwise, by the
induction hypothesis, there exist bases {v_2, ..., v_k} of V_2 and {w_2, ..., w_n} of W_2
and an integer t such that
\[ f(v_i, w_j) = \begin{cases} 1 & \text{if } 2 \le i = j \le t, \\ 0 & \text{otherwise.} \end{cases} \]
Then {v_1, ..., v_k} and {w_1, ..., w_n} are the bases we want.


We see that if V and W are vector spaces finitely generated over a field F and if
f ∈ Bill(V × W), then Proposition 20.5 says that there exist bases of V and W with
respect to which f is represented by a matrix of the form
\[ \begin{bmatrix} I & O \\ O & O \end{bmatrix}. \]

We will be particularly interested in symmetric bilinear forms. As an immediate
consequence of the definition, we see that if V is a vector space finitely generated
over a field F and if B is a given basis for V, then a bilinear form f ∈ Bill(V × V)
is symmetric if and only if the matrix T_{BB}(f) is symmetric. Moreover, every
symmetric matrix is T_{BB}(f) for some symmetric bilinear form f ∈ Bill(V × V).


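Example Let B be the canonical basis of R³ and let
\[ A = \begin{bmatrix} 1 & -5 & 3 \\ -5 & 1 & 7 \\ 3 & 7 & 4 \end{bmatrix}. \]
Then A = T_{BB}(f), where f ∈ Bill(R³ × R³) is defined by
\[ f : \left( \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}, \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} \right) \mapsto a_1b_1 + a_2b_2 - 5(a_1b_2 + a_2b_1) + 3(a_1b_3 + a_3b_1) + 7(a_2b_3 + a_3b_2) + 4a_3b_3. \]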
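
The matrix of this example can be recovered mechanically by evaluating f on all pairs of canonical basis vectors, as in the following short Python sketch (the variable names are illustrative).

```python
def f(a, b):
    """The bilinear form of the example, on R^3 x R^3."""
    return (a[0] * b[0] + a[1] * b[1] - 5 * (a[0] * b[1] + a[1] * b[0])
            + 3 * (a[0] * b[2] + a[2] * b[0]) + 7 * (a[1] * b[2] + a[2] * b[1])
            + 4 * a[2] * b[2])

basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

# Entry (i, j) of the representing matrix is f(e_i, e_j).
A = [[f(x, y) for y in basis] for x in basis]
```

The symmetry of the resulting matrix reflects the symmetry of f.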


Proposition 20.6 Let F be a field of characteristic other than 2 and let V be
a vector space finitely generated over F. Let f ∈ Bill(V × V) be symmetric.
Then there exists a basis B = {v_1, ..., v_n} of V such that T_{BB}(f) is a
diagonal matrix.
Proof The proposition is trivially true if f is the 0-function, and so we can assume
that is not the case. We will proceed by induction on n = dim(V). For n = 1, the
result is again immediate, and so we can assume that n > 1 and that the result has
been established for all spaces having dimension less than n. We first claim that
there exists a vector v ∈ V satisfying f(v, v) ≠ 0. Indeed, assume that this is not the
case. Then if v and w are arbitrary vectors in V we have

0 = f(v + w, v + w) = f(v, v) + 2f(v, w) + f(w, w) = 2f(v, w)

and since the characteristic of F is not 2, this implies that f(v, w) = 0, contradicting
our assumption that f is not the 0-function. Hence we can select a vector v_1 ∈ V
satisfying f(v_1, v_1) ≠ 0.

Let V_1 = Fv_1 and let V_2 = V_1^{⊥_f}. From the definition of V_1 it is clear that V_1
and V_2 are disjoint, and from Proposition 20.3 it follows that V = V_1 ⊕ V_2
(indeed, if v ∈ V and c = f(v_1, v_1)⁻¹ f(v_1, v), then v − cv_1 ∈ V_2). In
particular, dim(V_2) = n − 1 and so, by the induction hypothesis, there exists a
basis C = {v_2, ..., v_n} of V_2, such that, if f_2 is the restriction of f to V_2, then
T_{CC}(f_2) is a diagonal matrix. Since f(v_1, v_i) = 0 for all 2 ≤ i ≤ n, it follows that
B = {v_1, ..., v_n} does indeed give us the desired result.


Thus we see that every symmetric matrix over a field of characteristic other than 
2 is congruent to a diagonal matrix. 
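
The inductive proof can be turned into a procedure that performs matching row and column operations. The following Python sketch works over the rationals; it is one possible implementation under that assumption, not the only one, and the names `congruence_diagonalize` and `conjugate` are illustrative.

```python
from fractions import Fraction

def congruence_diagonalize(a):
    """Return (p, d) with d = p a p^T diagonal, for a symmetric matrix a over Q."""
    n = len(a)
    d = [[Fraction(x) for x in row] for row in a]
    p = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

    def add(src, dst, c):
        # Add c*(row src) to row dst, then c*(col src) to col dst; mirror rows on p.
        for j in range(n):
            d[dst][j] += c * d[src][j]
            p[dst][j] += c * p[src][j]
        for i in range(n):
            d[i][dst] += c * d[i][src]

    for i in range(n):
        if d[i][i] == 0:
            # Create a nonzero pivot, echoing the claim f(v, v) != 0 in the proof.
            for j in range(i + 1, n):
                if d[i][j] != 0:
                    add(j, i, 1 if 2 * d[i][j] + d[j][j] != 0 else -1)
                    break
        if d[i][i] == 0:
            continue   # row and column i are already zero
        for j in range(i + 1, n):
            add(i, j, -d[j][i] / d[i][i])
    return p, d

def conjugate(p, a):
    """Compute p a p^T."""
    n = len(a)
    pa = [[sum(p[i][r] * a[r][j] for r in range(n)) for j in range(n)]
          for i in range(n)]
    return [[sum(pa[i][r] * p[j][r] for r in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, -5, 3], [-5, 1, 7], [3, 7, 4]]
P, D = congruence_diagonalize(A)
H = [[0, 1], [1, 0]]
PH, DH = congruence_diagonalize(H)
```

The pivot-creation step divides by nothing but uses the fact that 2 is nonzero, which is where the restriction on the characteristic is needed; the matrix `H` exercises exactly that step.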


Proposition 20.7 Let V be a vector space finitely generated over C and let
f ∈ Bill(V × V) be a symmetric bilinear form of rank r. Then there exists a
basis B = {v_1, ..., v_n} of V satisfying the following conditions:

(1) T_{BB}(f) is a diagonal matrix;

(2) \[ f(v_i, v_i) = \begin{cases} 1 & \text{if } 1 \le i \le r, \\ 0 & \text{otherwise.} \end{cases} \]


Proof By Proposition 20.6, we know that there is a basis B = {v_1, ..., v_n} of V
satisfying the condition that T_{BB}(f) is a diagonal matrix. This matrix is of rank r
and so, renumbering the basis elements if necessary, we can assume that
f(v_i, v_i) ≠ 0 when and only when 1 ≤ i ≤ r. For each 1 ≤ i ≤ r, choose c_i ∈ C
satisfying c_i² = f(v_i, v_i)⁻¹, and replace v_i by c_i v_i to get a basis satisfying (2)
as well.


Let V be a vector space finitely generated over a field F of characteristic other
than 2 and let f ∈ Bill(V × V) be a bilinear form. The function q : V → F
defined by q : v ↦ f(v, v) is called the quadratic form defined by f. Note that if
a ∈ F and v ∈ V then q(av) = f(av, av) = a²f(v, v) = a²q(v). Moreover, if
f ∈ Bill(V × V) and if g ∈ Bill(V × V) is the symmetric bilinear form
g : (v, w) ↦ ½[f(v, w) + f(w, v)] then the quadratic forms defined by f and g are
the same. Therefore, without loss of generality, we will always assume that all
quadratic forms over such fields are defined by symmetric bilinear forms. We further
see that different symmetric bilinear forms define different quadratic forms, since,
for any v, w ∈ V, we have f(v, w) = ½[q(v + w) − q(v) − q(w)]. The classification of
quadratic forms is of great importance in analytic geometry and in number theory.
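
The recovery of a symmetric bilinear form from its quadratic form can be illustrated with exact rational arithmetic. In the sketch below the symmetric matrix `A` and the test vectors are arbitrary choices made for the illustration, and the name `polarize` is hypothetical.

```python
from fractions import Fraction

A = [[1, -5, 3], [-5, 1, 7], [3, 7, 4]]   # any symmetric matrix will do

def f(v, w):
    """Symmetric bilinear form (v, w) -> v^T A w."""
    return sum(v[i] * A[i][j] * w[j] for i in range(3) for j in range(3))

def q(v):
    """Quadratic form defined by f."""
    return f(v, v)

def polarize(v, w):
    """Recover f(v, w) from q alone; needs 2 to be invertible in the field."""
    s = [x + y for x, y in zip(v, w)]
    return Fraction(q(s) - q(v) - q(w), 2)

v, w = (1, 2, -1), (0, 3, 5)
```

The same sample also exhibits the homogeneity q(av) = a²q(v) noted above.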



The theory of quadratic forms over R was developed 
by Gauss and his student Eisenstein, and the need 
to study such forms was one of the factors which 
led to the development of determinant theory. Their 
work was extended to quadratic forms over C by 
the nineteenth-century British mathematician Henry 
Smith. The fundamental development in the theory of symmetric bilinear forms on vec- 
tor spaces over fields of characteristic other than 2 is due to the twentieth-century German 
mathematician Ernst Witt. 


Let V be a vector space over R. A quadratic form q : V → R is positive if and
only if q(v) > 0 for all 0_V ≠ v ∈ V. If q : V → R is a positive quadratic form
defined by a symmetric bilinear form f ∈ Bill(V × V), then f must be nondegenerate.
Indeed, if 0_V ≠ v ∈ V then f(v, v) ≠ 0.


Example If V is the vector space of all polynomial functions from R to itself, 
then we have a symmetric bilinear form from V x V to R defined by (f, g) => 
IH f(t)e(t)dt, which in turn defines the positive quadratic form f —> I} f(t)? dt. 


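Example Let V = R⁴ and let f ∈ Bill(V × V) be the symmetric bilinear form
\[ f : \left( \begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{bmatrix}, \begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{bmatrix} \right) \mapsto a_1b_1 + a_2b_2 + a_3b_3 - a_4b_4, \]
which lies at the center of Minkowski’s mathematical formulation of Einstein’s
relativity theory. The quadratic form defined by this bilinear form is
\[ \begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{bmatrix} \mapsto a_1^2 + a_2^2 + a_3^2 - a_4^2. \]
A similar symmetric bilinear form is the Lorentz form
\[ \left( \begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{bmatrix}, \begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{bmatrix} \right) \mapsto a_1b_1 + a_2b_2 + a_3b_3 - c^2 a_4b_4, \]
where c is the speed of light. The quadratic form defined by this bilinear form is
\[ \begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{bmatrix} \mapsto a_1^2 + a_2^2 + a_3^2 - c^2 a_4^2. \]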


The Dutch physicist Hendrik Antoon Lorentz, the first to conceive
of the notion of the electron, won a Nobel prize in 1902. His work
formed a basis for much of Einstein’s theory.


A more general result, also based on the work of Lorentz and Minkowski, gives
a fascinating “reversal” of the Cauchy-Schwarz-Bunyakovski inequality. Let n be a
positive integer and consider the subset (not subspace) U of R^{n+1} consisting of all
vectors of the form $\begin{bmatrix} a \\ v \end{bmatrix}$, where a is a nonnegative real
number and v ∈ Rⁿ satisfies ‖v‖ ≤ a. For
$u = \begin{bmatrix} a \\ v \end{bmatrix}$ and $y = \begin{bmatrix} b \\ w \end{bmatrix}$
in U, let us define u □ y to be ab − v · w. By our assumption on U, we note that
u □ u ≥ 0 for every u ∈ U. Then one can show that
u □ y ≥ [√(u □ u)][√(y □ y)]. This inequality is often known as the lightcone
inequality because of its applications in physics.
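
The lightcone inequality can be spot-checked on random vectors of U. The sketch below samples the case n = 3 with a fixed seed; the sampling scheme, the floating-point tolerance, and the names `box` and `random_cone_vector` are assumptions of this illustration, and a numerical check of this kind is of course no substitute for a proof.

```python
import math
import random

def box(u, y):
    """u [] y = ab - v.w for u = (a, v) and y = (b, w)."""
    return u[0] * y[0] - sum(s * t for s, t in zip(u[1:], y[1:]))

def random_cone_vector(n, rng):
    """Random vector (a, v) with v in R^n and ||v|| <= a."""
    v = [rng.uniform(-1, 1) for _ in range(n)]
    a = math.sqrt(sum(x * x for x in v)) + rng.uniform(0, 1)
    return [a] + v

rng = random.Random(0)
pairs = [(random_cone_vector(3, rng), random_cone_vector(3, rng))
         for _ in range(1000)]

# max(..., 0.0) guards against tiny negative values caused by rounding.
holds = all(
    box(u, y) >= math.sqrt(max(box(u, u), 0.0)) * math.sqrt(max(box(y, y), 0.0)) - 1e-9
    for u, y in pairs
)
```

Note the direction of the inequality: it is the opposite of the one obtained from an inner product.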


Example If V is an inner product space over R, then we have already noted that
the function f : (v, w) ↦ ⟨v, w⟩ is a symmetric bilinear form. The quadratic form
defined by f is given by v ↦ ‖v‖². This quadratic form is surely positive. The
converse is also true. If f ∈ Bill(V × V) is a symmetric bilinear form defining a
positive quadratic form, then f is an inner product on V, in the sense of Chap. 15.


By Proposition 20.7, we see that if V is a vector space finitely generated over C
and if f ∈ Bill(V × V) is symmetric and has rank r, we can find a basis {v_1, ..., v_n}
of V such that the quadratic form q defined by f is given by
\[ q : \sum_{i=1}^{n} a_i v_i \mapsto \sum_{i=1}^{r} a_i^2. \]


Example Let F be either R or C. Let n be a positive integer and let A ∈ M_{n×n}(F)
be symmetric. Let f ∈ Bill(Fⁿ × Fⁿ) be the symmetric bilinear form given by
f : (v, w) ↦ v^T Aw, and let q be the quadratic form defined by f. The set {q(v) |
‖v‖ = 1} (here the norm is the one defined by the dot product on Fⁿ) is called the
numerical range of the matrix A. In the case F = C, this is always a bounded convex
subset which contains all of the eigenvalues of A. For the special case n = 2, this
set is an ellipse with its foci at the eigenvalues of A, assuming that they are distinct,
or a circle with center at the sole eigenvalue of A, assuming that A has only one
eigenvalue of multiplicity 2. For n > 2, the characterization of the numerical range
is much more complicated.
Proposition 20.8 Let n be a positive integer and let A ∈ M_{n×n}(R) be
symmetric. Let f ∈ Bill(Rⁿ × Rⁿ) be the symmetric bilinear form given by
f : (v, w) ↦ v^T Aw, and let q be the quadratic form defined by f. Let
c_1 ≥ c_2 ≥ ⋯ ≥ c_n be the eigenvalues of A. Then the numerical range of A
lies in the closed interval [c_n, c_1]. Moreover, both endpoints of this interval
belong to the numerical range of A.


Proof By Proposition 17.7, we know that there exists an orthonormal basis B =
{v_1, ..., v_n} of V = Rⁿ consisting of eigenvectors of A, with v_i associated with c_i.
Moreover, if v ∈ V satisfies ‖v‖ = 1 then v = Σ_{i=1}^{n} ⟨v, v_i⟩v_i by Proposition 17.9,
and so 1 = ‖v‖² = ⟨v, v⟩ = Σ_{i=1}^{n} ⟨v, v_i⟩². We also see that
Av = Σ_{i=1}^{n} ⟨v, v_i⟩Av_i = Σ_{i=1}^{n} c_i⟨v, v_i⟩v_i. Thus
v^T Av = ⟨v, Av⟩ = Σ_{i=1}^{n} c_i⟨v, v_i⟩². But
\[ c_1 = c_1 \sum_{i=1}^{n} \langle v, v_i\rangle^2 \ \ge\ \sum_{i=1}^{n} c_i \langle v, v_i\rangle^2 \ \ge\ c_n \sum_{i=1}^{n} \langle v, v_i\rangle^2 = c_n. \]
Therefore, the numerical range of A lies in the closed interval [c_n, c_1].

If v is a normal eigenvector of A corresponding to c_n, then v^T Av = ⟨v, Av⟩ =
⟨v, c_n v⟩ = c_n⟨v, v⟩ = c_n, and similarly for the case of an eigenvector of A satisfying
‖v‖ = 1 and corresponding to c_1.
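
For a symmetric 2 × 2 real matrix the statement of Proposition 20.8 can be observed directly by sampling the unit circle. In the following sketch the matrix `A`, the sample density, and the tolerances are illustrative assumptions; the eigenvalues are computed from the trace and determinant.

```python
import math

A = [[2.0, 1.0], [1.0, 3.0]]   # illustrative symmetric matrix

def q(v):
    """Quadratic form v -> v^T A v."""
    return sum(v[i] * A[i][j] * v[j] for i in range(2) for j in range(2))

# Eigenvalues of a symmetric 2x2 matrix from its trace and determinant.
tr = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
disc = math.sqrt(tr * tr - 4 * det)
c1, c2 = (tr + disc) / 2, (tr - disc) / 2

# Sample q on unit vectors (cos t, sin t).
values = [q((math.cos(t), math.sin(t)))
          for t in (2 * math.pi * k / 3600 for k in range(3600))]
```

The sampled values stay between the two eigenvalues, and their extremes approach c2 and c1 as the sampling is refined.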


In order to see the geometric significance of quadratic forms, let us recall that a
general quadratic equation in three unknowns over R is one of the form
\[ (a_{11}X_1^2 + a_{22}X_2^2 + a_{33}X_3^2) + 2(a_{12}X_1X_2 + a_{13}X_1X_3 + a_{23}X_2X_3) + b_1X_1 + b_2X_2 + b_3X_3 + c = 0 \]
in which not all of the a_{ij} are equal to 0. Such an equation can be written
in the form f(v, v) + w · v + c = 0, where f ∈ Bill(R³ × R³) is the symmetric
bilinear form defined with respect to the canonical basis by the matrix
\[ A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{12} & a_{22} & a_{23} \\ a_{13} & a_{23} & a_{33} \end{bmatrix}, \quad\text{where } w = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}, \quad\text{and where } v = \begin{bmatrix} X_1 \\ X_2 \\ X_3 \end{bmatrix}. \]
The graph of such an equation is a quadratic surface. The various quadratic
surfaces in R³ can then be classified by considering congruence classes of the
matrices A, a task very important in analytic geometry.

We will now return to the general case of bilinear transformations. Let F be a
field, let V and W be vector spaces over F, and let G = F^{(V×W)}. Then G is a
subspace of F^{V×W} having a basis {g_{v,w} | (v, w) ∈ V × W}, where
\[ g_{v,w} : (v', w') \mapsto \begin{cases} 1 & \text{if } (v', w') = (v, w), \\ 0 & \text{otherwise.} \end{cases} \]
Let H be the subspace of G generated by all functions of the form
\[ g_{v_1+v_2,w} - g_{v_1,w} - g_{v_2,w}, \quad g_{v,w_1+w_2} - g_{v,w_1} - g_{v,w_2}, \quad g_{av,w} - a\,g_{v,w}, \quad\text{or}\quad g_{v,aw} - a\,g_{v,w} \]
for all v, v_1, v_2 ∈ V, w, w_1, w_2 ∈ W, and a ∈ F. Let us pick a complement of H
in G, and call it V ⊗ W. By Proposition 7.8, we know that V ⊗ W is unique up
to isomorphism. Let ε be the projection of G with image V ⊗ W coming from the
decomposition G = H ⊕ (V ⊗ W) and, for all v ∈ V and w ∈ W, denote ε(g_{v,w})
by v ⊗ w. Then B = {v ⊗ w | (v, w) ∈ V × W} is a generating set for V ⊗ W. It
is important to emphasize that the elements of V ⊗ W are linear combinations of
elements of B. In quantum physics, elements of (V ⊗ W) ∖ B, for suitable spaces V
and W, are known as entangled tensors and these have important physical
interpretations. Elements of B are known as simple tensors.
If v_1, v_2 ∈ V and w ∈ W, then

[v_1 + v_2] ⊗ w − (v_1 ⊗ w) − (v_2 ⊗ w) = ε(g_{v_1+v_2,w} − g_{v_1,w} − g_{v_2,w}) = 0_G

and so [v_1 + v_2] ⊗ w = (v_1 ⊗ w) + (v_2 ⊗ w). Similarly, if v ∈ V and w_1, w_2 ∈ W
then v ⊗ [w_1 + w_2] = v ⊗ w_1 + v ⊗ w_2. We also see that if v ∈ V, w ∈ W and c ∈ F,
then cv ⊗ w = c(v ⊗ w) = v ⊗ cw. The vector space V ⊗ W is called the tensor
product of V and W.



There are many equivalent definitions of the tensor product. The definition
given here is due to the twentieth-century French mathematician Claude Chevalley.
The notion of a tensor was first introduced in differential calculus by the
nineteenth-century Italian mathematicians Gregorio Ricci-Curbastro and Tullio
Levi-Civita and became a central tool in relativity theory.


From the definition of the tensor product, we see that the function t_{VW} from
V × W to V ⊗ W given by (v, w) ↦ v ⊗ w is a bilinear transformation. This
transformation has a very special significance, due to the following theorem, which
allows us to move from bilinear transformations to linear transformations.


Proposition 20.9 Let V, W, and Y be vector spaces over a field F. For each
bilinear transformation f ∈ Bill(V × W, Y) there exists a unique linear
transformation α ∈ Hom(V ⊗ W, Y) satisfying f = αt_{VW}.


Proof Given f ∈ Bill(V × W, Y), there exists a linear transformation β ∈ Hom(G, Y)
defined on the elements of a basis of the space G defined above, given by the
condition that β : g_{v,w} ↦ f(v, w). Since f is a bilinear transformation, H ⊆ ker(β)
and so we can define the linear transformation α ∈ Hom(V ⊗ W, Y) by setting
α : Σ_{i=1}^{n} a_i[v_i ⊗ w_i] ↦ Σ_{i=1}^{n} a_i f(v_i, w_i). This function is well-defined
since if Σ_{i=1}^{n} a_i[v_i ⊗ w_i] = Σ_{i=1}^{n} b_i[v_i ⊗ w_i] in V ⊗ W then
Σ_{i=1}^{n} (a_i − b_i)g_{v_i,w_i} ∈ H ⊆ ker(β). Therefore, we see that
α(Σ_{i=1}^{n} a_i[v_i ⊗ w_i]) = α(Σ_{i=1}^{n} b_i[v_i ⊗ w_i]). Clearly,
α is a linear transformation and satisfies f = αt_{VW}.

We are left to prove uniqueness. Suppose that γ ∈ Hom(V ⊗ W, Y) satisfies
f = γt_{VW}. In particular, α(v ⊗ w) = γ(v ⊗ w) for all (v, w) ∈ V × W. That is to
say, α and γ act identically on a generating set for V ⊗ W and so, in particular, on
a basis for V ⊗ W contained in this generating set. Therefore, by Proposition 6.2, it
follows that α = γ.


The following proposition is very important, and is often used as a basis for the 
definition of the tensor product. 


Proposition 20.10 If V, W, and Y are vector spaces over a field F, then the
vector spaces Hom(V ⊗ W, Y) and Hom(V, Hom(W, Y)) are isomorphic.


Proof The function Hom(V ⊗ W, Y) → Bill(V × W, Y) defined by β ↦ βt_{VW} is
clearly a linear transformation, and from Proposition 20.9 it follows that this is an
isomorphism. Therefore, the result follows from Proposition 20.1.


Example Let V and W be vector spaces over a field F and let δ_1 ∈ D(V) and
δ_2 ∈ D(W) be linear functionals. Then there exists a bilinear form in Bill(V × W)
defined by (v, w) ↦ δ_1(v)δ_2(w). From Proposition 20.9, it follows that there exists
a linear functional δ_1 ⊗ δ_2 ∈ D(V ⊗ W) satisfying
δ_1 ⊗ δ_2 : Σ_{i=1}^{n} a_i[v_i ⊗ w_i] ↦ Σ_{i=1}^{n} a_i δ_1(v_i)δ_2(w_i).


Example More generally, let V and W be vector spaces over a field F, let α be
an endomorphism of V, and let β be an endomorphism of W. The function from
V × W to V ⊗ W defined by

(v, w) ↦ α(v) ⊗ β(w)

is a bilinear transformation and so defines an endomorphism α ⊗ β of V ⊗ W
satisfying α ⊗ β : Σ_{i=1}^{n} a_i[v_i ⊗ w_i] ↦ Σ_{i=1}^{n} a_i[α(v_i) ⊗ β(w_i)].


By Proposition 5.13, we know that if V ⊕ W is a vector space finitely generated
over a field F, then dim(V ⊕ W) = dim(V) + dim(W). We now prove the
“multiplicative” analog of this assertion for tensor products.


Proposition 20.11 Let V and W be vector spaces finitely generated over
a field F. Then V ⊗ W is also finitely generated, and dim(V ⊗ W) =
dim(V) dim(W).
Proof Let us choose bases {v_1, ..., v_k} of V and {w_1, ..., w_n} of W. Then
for any v = Σ_{i=1}^{k} a_i v_i ∈ V and w = Σ_{j=1}^{n} b_j w_j ∈ W, we see that
v ⊗ w = Σ_{i=1}^{k} Σ_{j=1}^{n} a_i b_j (v_i ⊗ w_j). Thus we see that
{v_i ⊗ w_j | 1 ≤ i ≤ k and 1 ≤ j ≤ n} is a generating set for V ⊗ W, showing that
V ⊗ W is finitely generated. Moreover, by Proposition 20.10 and Proposition 14.8,
we see that the dimension of V ⊗ W is equal to the dimension of D(V ⊗ W) and
hence to the dimension of Hom(V, D(W)), and this is equal to the dimension of
Hom(V, W), which is precisely dim(V) dim(W).


In particular, we see that in the context of Proposition 20.11, the set
{v_i ⊗ w_j | 1 ≤ i ≤ k and 1 ≤ j ≤ n} is in fact a basis of V ⊗ W.


Example Let F be a field and let k and n be positive integers. Then, by
Proposition 20.11, we know that dim(F^k ⊗ Fⁿ) = kn = dim(M_{k×n}(F)), and so the
vector spaces F^k ⊗ Fⁿ and M_{k×n}(F) are isomorphic. Indeed, if we choose bases
{v_1, ..., v_k} of F^k and {w_1, ..., w_n} of Fⁿ, then the function v_i ⊗ w_j ↦ v_i ∧ w_j
extends to an isomorphism between these two spaces.
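
Under this isomorphism, and assuming that v ∧ w denotes the matrix v wᵀ, simple tensors correspond to matrices of rank at most 1, so rank gives a practical test for distinguishing simple tensors from entangled ones. The Python sketch below works over the rationals; the helper names `rank` and `outer` and the sample vectors are illustrative.

```python
from fractions import Fraction

def rank(m):
    """Rank of a matrix over Q by Gaussian elimination."""
    rows = [[Fraction(x) for x in r] for r in m]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk

def outer(v, w):
    """Matrix v w^T, the image of the simple tensor v (x) w."""
    return [[a * b for b in w] for a in v]

simple = outer([1, 2], [3, -1, 4])
e = [[1, 0], [0, 1]]
# Image of e_1 (x) e_1 + e_2 (x) e_2, which turns out not to be simple.
entangled = [[outer(e[0], e[0])[i][j] + outer(e[1], e[1])[i][j]
              for j in range(2)] for i in range(2)]
```

The second matrix has rank 2, so the corresponding tensor cannot be written as a single v ⊗ w.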


Example Let F be a field, let n be a positive integer, and let V be a vector
space finitely generated over F and having a basis {v_1, ..., v_k}. The dimension
of the vector space M_{n×n}(V) over F is n²k. Consider the bilinear transformation
f : M_{n×n}(F) × V → M_{n×n}(V) defined by ([a_{ij}], v) ↦ [a_{ij}v]. By
Proposition 20.9, we know that this bilinear transformation defines a linear
transformation α : M_{n×n}(F) ⊗ V → M_{n×n}(V) and it is clear that this is an
epimorphism. But, by Proposition 20.11, we see that the dimension of M_{n×n}(F) ⊗ V
is also equal to n²k and so α must be an isomorphism.


Example Let F be a field and let k, n, s, and t be positive integers. Let f :
M_{k×n}(F) × M_{s×t}(F) → M_{ks×nt}(F) be the function defined by
\[ f : (A, B) \mapsto \begin{bmatrix} a_{11}B & \cdots & a_{1n}B \\ \vdots & & \vdots \\ a_{k1}B & \cdots & a_{kn}B \end{bmatrix}. \]
This is a bilinear transformation of vector spaces over F and so, by Proposition 20.9,
it defines a linear transformation α : M_{k×n}(F) ⊗ M_{s×t}(F) → M_{ks×nt}(F) which,
again, can be shown to be an isomorphism. In the literature, it is usual to write A ⊗ B
instead of f(A, B). This matrix is called the Kronecker product of the matrices A
and B. Kronecker products are very important in matrix theory and its applications.
It is easy to see that for all such matrices A and B we have (A ⊗ B)^T = A^T ⊗ B^T.
Moreover, if k = n and s = t and if A and B are nonsingular, then A ⊗ B is
nonsingular, and (A ⊗ B)⁻¹ = A⁻¹ ⊗ B⁻¹. We also note that if A and B are symmetric
then so is A ⊗ B. Furthermore, Cholesky or QR-factorizations of A ⊗ B come
immediately from the corresponding factorizations of A and B.
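
The transpose and inverse rules for Kronecker products can be verified on small matrices. The following Python sketch builds the block matrix directly from its definition; the function name `kron` and all sample matrices are illustrative assumptions.

```python
from fractions import Fraction

def kron(a, b):
    """Kronecker product: the block matrix [a_ij * B]."""
    return [[a[i][j] * b[r][s]
             for j in range(len(a[0])) for s in range(len(b[0]))]
            for i in range(len(a)) for r in range(len(b))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][r] * b[r][j] for r in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

A = [[1, 2], [3, 4]]
B = [[0, 1, 2], [1, 0, -1]]
A_inv = [[-2, 1], [Fraction(3, 2), Fraction(-1, 2)]]   # inverse of A
S = [[2, 1], [1, 1]]                                   # a second nonsingular matrix
S_inv = [[1, -1], [-1, 2]]                             # inverse of S

identity4 = [[int(i == j) for j in range(4)] for i in range(4)]
product = matmul(kron(A, S), kron(A_inv, S_inv))
```

The value of `product` equals `identity4`, which is the statement (A ⊗ S)⁻¹ = A⁻¹ ⊗ S⁻¹ for this pair of matrices.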
As an example of the use of Kronecker products, we note the following result,
established in the 1970s by the American mathematicians Michael Gauger
and Christopher Byrnes: Let F be a field, let n be a positive integer, and let
A, B ∈ M_{n×n}(F). Let I be the multiplicative identity of M_{n×n}(F). Then the
matrices A and B are similar if and only if they have the same characteristic polynomial
and the n² × n² matrices A ⊗ I − I ⊗ A, B ⊗ I − I ⊗ B, and A ⊗ I − I ⊗ B all
have the same rank.

Because of the utility of Kronecker products, one can raise the following
problem: Given positive integers k, n, s, and t, and given C ∈ M_{ks×nt}(R), find
matrices A ∈ M_{k×n}(R) and B ∈ M_{s×t}(R) such that ‖C − A ⊗ B‖ is minimal.
Several algorithms have been developed for finding a solution to this problem, the
first by the American computer scientists Charles Van Loan and Nikos Pitsianis.

Let V, V', W, and W' be vector spaces over a field F. If α ∈ Hom(V, V') and
β ∈ Hom(W, W') then we have a bilinear transformation V × W → V' ⊗ W'
defined by (v, w) ↦ α(v) ⊗ β(w) and so, by Proposition 20.9, there exists a linear
transformation from V ⊗ W to V' ⊗ W' satisfying v ⊗ w ↦ α(v) ⊗ β(w). We will
denote this linear transformation by α ⊗ β.


Proposition 20.12 Let V, V', W, and W' be vector spaces finitely generated
over a field F. Any element of the space Hom(V ⊗ W, V' ⊗ W') is of the
form Σ_{i=1}^{n} α_i ⊗ β_i, where α_i ∈ Hom(V, V') and β_i ∈ Hom(W, W') for each
1 ≤ i ≤ n.


Proof The function (α, β) ↦ α ⊗ β from Hom(V, V') × Hom(W, W') to
Hom(V ⊗ W, V' ⊗ W') is bilinear and so defines a linear transformation
φ : Hom(V, V') ⊗ Hom(W, W') → Hom(V ⊗ W, V' ⊗ W'). We are done if we can show
that φ is an isomorphism. By Propositions 8.1 and 20.11, we know that

dim(Hom(V, V') ⊗ Hom(W, W')) = dim(Hom(V, V')) dim(Hom(W, W'))
= dim(V) dim(V') dim(W) dim(W')
= dim(V ⊗ W) dim(V' ⊗ W')
= dim(Hom(V ⊗ W, V' ⊗ W')),

and so it suffices to prove that φ is a monomorphism.

Indeed, assume that Σ_{i=1}^{n} α_i ⊗ β_i ∈ ker(φ), where the set {β_1, ..., β_n} is
linearly independent, and where none of the α_i is the 0-function. Then
Σ_{i=1}^{n} α_i(v) ⊗ β_i(w) = 0_{V'⊗W'} for all v ∈ V and all w ∈ W. Pick v ∈ V
satisfying α_1(v) ≠ 0_{V'}. By renumbering if necessary, we can assume that
{α_1(v), ..., α_k(v)} is a maximal linearly independent subset of
{α_1(v), ..., α_n(v)}. Therefore, for each k < h ≤ n there exist scalars b_{hj}, not all
of them being equal to 0, such that α_h(v) = Σ_{j=1}^{k} b_{hj}α_j(v) and so
\[ \begin{aligned} 0_{V'\otimes W'} &= \sum_{i=1}^{k} \alpha_i(v) \otimes \beta_i(w) + \sum_{h=k+1}^{n} \Bigl(\sum_{j=1}^{k} b_{hj}\alpha_j(v)\Bigr) \otimes \beta_h(w) \\ &= \sum_{i=1}^{k} \alpha_i(v) \otimes \beta_i(w) + \sum_{j=1}^{k} \alpha_j(v) \otimes \Bigl(\sum_{h=k+1}^{n} b_{hj}\beta_h(w)\Bigr) \\ &= \sum_{i=1}^{k} \alpha_i(v) \otimes \Bigl(\beta_i(w) + \sum_{h=k+1}^{n} b_{hi}\beta_h(w)\Bigr). \end{aligned} \]
Since the set {α_1(v), ..., α_k(v)} is linearly independent, we must have
β_i(w) + Σ_{h=k+1}^{n} b_{hi}β_h(w) = 0_{W'} for all 1 ≤ i ≤ k and all w ∈ W. Hence
β_i + Σ_{h=k+1}^{n} b_{hi}β_h is the 0-function for all 1 ≤ i ≤ k, contradicting the
assumption that the set {β_1, ..., β_n} is linearly independent. We therefore conclude
that ker(φ) is trivial, which is what we needed to prove.


Proposition 20.13 If $U$, $V$, and $W$ are vector spaces over a field $F$, then

\[ U \otimes (V \otimes W) \cong (U \otimes V) \otimes W. \]


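Proof The bilinear transformation $U \times (V \otimes W) \to (U \otimes V) \otimes W$
defined by $(u, v \otimes w) \mapsto (u \otimes v) \otimes w$ induces a linear
transformation $\alpha : U \otimes (V \otimes W) \to (U \otimes V) \otimes W$ which
satisfies $u \otimes (v \otimes w) \mapsto (u \otimes v) \otimes w$. Similarly, we have a
linear transformation $\beta : (U \otimes V) \otimes W \to U \otimes (V \otimes W)$ which
satisfies $(u \otimes v) \otimes w \mapsto u \otimes (v \otimes w)$. Since $\alpha\beta$
and $\beta\alpha$ are clearly the respective identity maps, we see that $\alpha$ must be
the isomorphism we seek.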


Proposition 20.14 If $V$ and $W$ are vector spaces over a field $F$, then
$V \otimes W \cong W \otimes V$.


Proof The bilinear transformation $V \times W \to W \otimes V$ defined by
$(v, w) \mapsto w \otimes v$ induces a linear transformation $\alpha$ from
$V \otimes W$ to $W \otimes V$ satisfying $\alpha : v \otimes w \mapsto w \otimes v$.
Similarly, there exists a linear transformation
$\beta : W \otimes V \to V \otimes W$ satisfying
$\beta : w \otimes v \mapsto v \otimes w$. Since $\alpha\beta$ and $\beta\alpha$ are
clearly the respective identity maps, we see that $\alpha$ must be the isomorphism we
seek.
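
In coordinates, the isomorphism of Proposition 20.14 is just a reindexing of entries. The sketch below assumes finitely generated spaces with fixed bases and lexicographic ordering of the tensor-product bases; the function names `tensor` and `swap` are hypothetical and serve only this illustration.

```python
def tensor(v, w):
    """Coordinates of the simple tensor of v and w (lexicographic basis order)."""
    return [vi * wj for vi in v for wj in w]


def swap(x, m, n):
    """Reindex coordinates from the tensor product of an m-dimensional space
    with an n-dimensional space to the product taken in the opposite order."""
    y = [0] * (m * n)
    for i in range(m):
        for j in range(n):
            # the coefficient of (i-th basis vector) tensor (j-th basis vector)
            # becomes the coefficient of (j-th) tensor (i-th)
            y[j * m + i] = x[i * n + j]
    return y


v = [1, 2]
w = [3, 4, 5]
print(swap(tensor(v, w), 2, 3) == tensor(w, v))  # True

x = list(range(6))
print(swap(swap(x, 2, 3), 3, 2) == x)            # True: the two maps are mutually inverse
```

The second check mirrors the step in the proof where the two composites are observed to be identity maps.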


Finally, let us briefly mention two algebras built on the notion of the tensor prod- 
uct. The study of these algebras is beyond the scope of this book. However, the 
reader should be aware of them and will find it fruitful to explore them further. In 
what ensues, V is an arbitrary vector space over a field F. 
(I) For each nonnegative integer $k$, we define the vector space $V^{\otimes k}$ over
$F$ by setting $V^{\otimes 0} = F$ and $V^{\otimes k} = V^{\otimes (k-1)} \otimes V$ if
$k > 0$. Let $T(V) = \coprod_{k=0}^{\infty} V^{\otimes k}$. We can define a product
$\bullet$ on $T(V)$ by setting
$(v_1 \otimes \cdots \otimes v_k) \bullet (v_{k+1} \otimes \cdots \otimes v_{k+m}) = v_1 \otimes \cdots \otimes v_{k+m}$
for all $v_1, \dots, v_{k+m} \in V$ and extending linearly. This is an $F$-algebra,
known as the tensor algebra of $V$ over $F$. The tensor algebra has several
important properties, one of which is that if $K$ is any algebra over $F$ then any
linear transformation $\alpha : V \to K$ can be uniquely extended to a homomorphism
of $F$-algebras from $T(V)$ to $K$. Moreover, if $W$ is a vector space over $F$ then
any linear transformation $\alpha : V \to W$ can be uniquely extended to a
homomorphism of $F$-algebras from $T(V)$ to $T(W)$. (In the language of category
theory, this says that $T(-)$ is a functor from the category of vector spaces over
$F$ to the category of $F$-algebras.)

(II) Let $Y$ be the subspace of $V \otimes V$ generated by
$\{v \otimes v \mid v \in V\}$. Then a complement of $Y$ in $V \otimes V$ is called an
exterior square of $V$ and is denoted by $V \wedge V$. This space is unique up to
isomorphism. If $\alpha$ is the projection of $V \otimes V$ with image $V \wedge V$ and
kernel $Y$, denote $\alpha(v \otimes w)$ by $v \wedge w$. Since
$(v + w) \otimes (v + w) = v \otimes v + v \otimes w + w \otimes v + w \otimes w$ for all
$v, w \in V$, we see that $v \wedge w = -w \wedge v$ for all $v, w \in V$. Therefore, if
$V$ is finitely generated over $F$ with basis $\{v_1, \dots, v_n\}$, we see that
$\{v_i \wedge v_j \mid 1 \le i < j \le n\}$ is a basis for $V \wedge V$, and hence
$\dim(V \wedge V) = \binom{n}{2} = \tfrac{1}{2} n(n - 1)$. This construction can be
iterated to more than two factors. If $k > 0$ is an integer, we can consider the
subspace $Y$ of $V^{\otimes k}$ generated by all expressions of the form
$v_1 \otimes \cdots \otimes v_k$ in which $v_i = v_j$ for some $i \ne j$. A complement
of $Y$ is denoted by $\bigwedge^k V$ and is called the $k$th exterior power of $V$. If
$V$ has finite dimension $n$, then $\dim(\bigwedge^k V) = \binom{n}{k}$. In particular,
we note that $\bigwedge^k V$ is trivial when $k > n$. The subspace
$\bigwedge(V) = \coprod_{k=0}^{\infty} \bigwedge^k V$ of $T(V)$ is known as the exterior
algebra of $V$, and has important applications in geometry and cohomology theory.
One can show that if $(K, \bullet)$ is a unital $F$-algebra and if
$\alpha : V \to K$ is a linear transformation satisfying the condition that
$\alpha(v) \bullet \alpha(v) = 0_K$ for all $v \in V$, then $\alpha$ can be uniquely
extended to a homomorphism of unital $F$-algebras from $\bigwedge(V)$ to $K$.
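
The dimension counts above can be checked by listing basis labels directly. The sketch below is only an illustration under the assumption that a basis with $n$ elements has been fixed: it enumerates the increasing index tuples that label a basis of the $k$th exterior power, and computes the coordinates of $v \wedge w$ in the exterior square. The names `exterior_basis` and `wedge2` are hypothetical.

```python
from itertools import combinations
from math import comb


def exterior_basis(n, k):
    """Index tuples i_1 < ... < i_k labelling a basis of the k-th exterior power
    of an n-dimensional space."""
    return list(combinations(range(1, n + 1), k))


def wedge2(v, w):
    """Coordinates of the wedge of v and w with respect to the basis elements
    labelled by pairs i < j, in lexicographic order."""
    n = len(v)
    return [v[i] * w[j] - v[j] * w[i] for i, j in combinations(range(n), 2)]


print(exterior_basis(3, 2))                        # [(1, 2), (1, 3), (2, 3)]
print(len(exterior_basis(4, 2)) == comb(4, 2))     # True: n(n - 1)/2 = 6 when n = 4
print(exterior_basis(3, 4))                        # []: the power is trivial when k > n

v, w = [1, 2, 3], [4, 5, 6]
print(wedge2(v, w) == [-c for c in wedge2(w, v)])  # True: anticommutativity
print(wedge2(v, v))                                # [0, 0, 0]
```

Summing the sizes of these bases over all $k$ gives $2^n$, which is the dimension of the exterior algebra when $V$ has finite dimension $n$.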


Exercises 


Exercise 1176 

Let n be a positive integer and let V be the space of all polynomials in C[X] 
of degree at most n. For p(X) = X a; X! and q(X) = Ẹbi Xİ in V, we de- 
fine the nth Bézout matrix Bez, (f, 8) € Mnxx(C) defined by f and g as fol- 
lows: Bez, (f, g) = [cij], where cij = la +k—-1Dj—k — ai—kbj+k—1] and 
m(i, j) = min{i,n + 1 — j}. Show that the function Bez, : V x V > Mnxx (C) 
is a bilinear transformation satisfying the conditions that Bez, (f, f) = O for all 
f €V.Ifn=max{deg(f), deg(g)}, show that Bez, (f, g) is nonsingular if and 
only if f and g have no common roots. 
Étienne Bézout was an eighteenth-century French mathematician.


Exercise 1177 


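Find $a \in \mathbb{R}$ such that the matrices
\[
\begin{bmatrix} 1 & a & a \\ 0 & 1 & a \\ 0 & 0 & 1 \end{bmatrix}
\quad\text{and}\quad
\begin{bmatrix} 1 & -1 & -1 \\ -1 & 1 & -1 \\ -1 & -1 & 1 \end{bmatrix}
\]
define the same bilinear form in $\operatorname{Bill}(\mathbb{R}^3, \mathbb{R}^3)$.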


Exercise 1178 
Let $V = \mathbb{Q}^{\mathbb{Q}}$. Is the function from $V \times V$ to $\mathbb{Q}$
given by $(f, g) \mapsto (f + g)(5) \cdot (f - g)(2)$ a bilinear form?


Exercise 1179 
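Let $F$ be a field and let $u : \mathbb{N} \times \mathbb{N} \to F$ be an arbitrary
function. Is the function $f_u : F[X] \times F[X] \to F[X]$ defined by

\[
f_u : \Bigl( \sum_{i=0}^{\infty} a_i X^i, \sum_{j=0}^{\infty} b_j X^j \Bigr)
  \mapsto \sum_{k=0}^{\infty} \Bigl( \sum_{i+j=k} u(i, j)\, a_i b_j \Bigr) X^k
\]

a bilinear transformation?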

Exercise 1180 

Let B be the canonical basis for the vector space V = R?. Find a bilinear form 
f € Bill(V x V) satisfying the condition Tgg( f) = f È : 


Exercise 1181 
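Let $B$ be the canonical basis for $\mathbb{R}^3$. Find $T_{BB}(f)$, where
$f : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}$ is the bilinear form defined by

\[
f : \left( \begin{bmatrix} a \\ b \\ c \end{bmatrix},
           \begin{bmatrix} a' \\ b' \\ c' \end{bmatrix} \right)
  \mapsto aa' + 2bc' + cc' + 2cb' - ab' + bb' - ba'.
\]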
Exercise 1182
Let $f : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}$ be the bilinear form defined by

\[
f : (v, w) \mapsto v \cdot \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 2 \\ 0 & 1 & 1 \end{bmatrix} w.
\]


1 0 1 
Find the matrix representing f with respect to the basis O},) 1], 
0 1 1 


of R?. 


Exercise 1183 

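Let $V$ and $W$ be vector spaces over a field $F$ and let
$\alpha \in \operatorname{Hom}(V, W)$. For each $g \in \operatorname{Bill}(W \times W)$,
let us define the bilinear form $g_\alpha \in \operatorname{Bill}(V \times V)$ by setting
$g_\alpha : (v, v') \mapsto g(\alpha(v), \alpha(v'))$. Is the function
$g \mapsto g_\alpha$ a linear transformation?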


Exercise 1184 

Let $F$ be a field of characteristic other than 2 and let $V$ be a vector space over
$F$. Let $f \in \operatorname{Bill}(V \times V)$. Show that $f(v, v) \ne 0$ for all
$0_V \ne v \in V$ if and only if for every nontrivial subspace $W$ of $V$ and for
every $0_V \ne w \in W$ there exists a vector $w' \in W$ satisfying $f(w, w') \ne 0$.


Exercise 1185 
Show that if V and W are vector spaces finitely generated over a field F of 
unequal dimensions, then there is no nondegenerate f € Bill(V x W). 


Exercise 1186 
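Let $F$ be a field of characteristic 0 and let the bilinear form
$f \in \operatorname{Bill}(F^3 \times F^3)$ be defined by

\[
f : \left( \begin{bmatrix} a \\ b \\ c \end{bmatrix},
           \begin{bmatrix} a' \\ b' \\ c' \end{bmatrix} \right)
  \mapsto aa' + bb' - cc'.
\]

Is there a nontrivial subspace $W$ of $F^3$ satisfying $f(w, w') = 0$ for all
$w, w' \in W$?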


Exercise 1187 
Let $f \in \operatorname{Bill}(\mathbb{R}^4 \times \mathbb{R}^4)$ be defined by $f : (v, w) \mapsto v \cdot (Aw)$, where


0 
0 
a= 1 
0 


= oo Oo 


1 0 
0 1 
0 0 
0 0 


Find a basis $\{v_1, v_2, v_3, v_4\}$ of $\mathbb{R}^4$ satisfying the condition that
$f(v_i, v_i) = 0$ for all $1 \le i \le 4$.
Exercise 1188

Let $f : M_{n \times n}(F) \times M_{n \times n}(F) \to F$ be the function defined by
$f : (A, B) \mapsto \operatorname{tr}(AB)$, where $F$ is a field and $n$ is a positive
integer. Is $f$ a bilinear form? Is $f$ symmetric?


Exercise 1189 
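Let $n$ be a positive integer and let
$f : M_{n \times n}(\mathbb{C}) \times M_{n \times n}(\mathbb{C}) \to \mathbb{C}$ be the
function defined by
$f : (A, B) \mapsto n \cdot \operatorname{tr}(AB) - \operatorname{tr}(A) \operatorname{tr}(B)$.
Show that $f$ is a symmetric bilinear form.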


Exercise 1190 
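Let $V$ be a vector space over a field $F$ and let
$f \in \operatorname{Bill}(V \times V)$ be a symmetric bilinear form. Let
$Y = F \times V$ and define an operation $\bullet$ on $Y$ by setting

\[
\begin{bmatrix} a \\ v \end{bmatrix} \bullet \begin{bmatrix} b \\ w \end{bmatrix}
  = \begin{bmatrix} ab + f(v, w) \\ aw + bv \end{bmatrix}
\]

for all $a, b \in F$ and all $v, w \in V$. Show that $(Y, \bullet)$ is a Jordan algebra.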


Exercise 1191 
Let $V$ be a vector space over $\mathbb{Q}$. Is the function $V \times V \to V$
defined by $(v, v') \mapsto v + v'$ a bilinear transformation?


Exercise 1192 


0 0 0 25 -5 35 
Are the matrices | 1 1 0 and 35 O0 —3 21 |in M3x3(R) congruent? 
1 1 1 0 —4 28 
Exercise 1193 
—2 
Find an upper-triangular matrix in M3x3(R) congruent to | —1 1 O | or 
— 4 


show that there is no such matrix. 


Exercise 1194 

Let $F$ be a field and let $n$ be a positive integer. A matrix
$A = [a_{ij}] \in M_{n \times n}(F)$ is an upper Hessenberg matrix if and only if
$a_{ij} = 0$ whenever $i - j \ge 2$. Is every matrix in $M_{n \times n}(\mathbb{R})$
necessarily congruent to an upper Hessenberg matrix?




Karl Hessenberg was a twentieth-century German engineer. 
Exercise 1195
Let $F$ be a field. Show that every upper triangular matrix in $M_{3 \times 3}(F)$ is
congruent to a lower triangular matrix.


Exercise 1196 
Let $n$ be a positive integer and let $A$ be a nonsingular symmetric matrix in
$M_{n \times n}(\mathbb{C})$. Show that $A$ is congruent to $A^{-1}$.


Exercise 1197 


Find a matrix P € M3x3(R) such that the matrix P P’ is diagonal. 


We N 
=. Oe 
We W 


Exercise 1198 


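Let
\[
A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix}
\in M_{4 \times 4}(\mathbb{R}).
\]
Find a nonsingular matrix $P \in M_{4 \times 4}(\mathbb{R})$ such that $PAP^T$ is
diagonal.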


Exercise 1199 
Find a diagonal matrix in M4x4(R) congruent to the matrix 


1 2 3 2 
2 3 4 8 
3 5 8 10 
2 8 10 -8 
Exercise 1200 
1 i 1+i 
Is the matrix i 0 2—i | €M3x3(C) congruent to I? 


1+i 2-i 10+4+2i 


Exercise 1201 

Let $n$ be a positive integer and let $\alpha$ be a positive-definite endomorphism
of $\mathbb{R}^n$ represented with respect to the canonical basis by the matrix $A$.
If $A'$ is a matrix congruent to $A$, does it too represent a positive-definite
endomorphism of $\mathbb{R}^n$ with respect to the canonical basis?


Exercise 1202 

Let $V$ be a vector space finitely generated over a field $F$ of characteristic
other than 2. If $f \in \operatorname{Bill}(V \times V)$ is symmetric and not the
0-function, show that there exists a vector $v \in V$ satisfying $f(v, v) \ne 0$.

Exercise 1203

Let $V$ be a vector space finitely generated over a field $F$ of characteristic
other than 2. If $f \in \operatorname{Bill}(V \times V)$, show that $f(v, v) = 0$ for
all $v \in V$ if and only if $f(v, w) = -f(w, v)$ for all $v, w \in V$.


Exercise 1204 

Let $n$ be a positive integer, let $F$ be a field, and let
$A \in M_{n \times n}(F)$. Show that there exists a symmetric matrix
$B \in M_{n \times n}(F)$ satisfying $v \cdot Av = v \cdot Bv$ for all $v \in F^n$.


Exercise 1205 


Find a bilinear form $f \in \operatorname{Bill}(\mathbb{R}^3 \times \mathbb{R}^3)$ which
defines the quadratic form

\[
\begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto a^2 - 2ab + 4ac - 2bc + 2c^2.
\]


Exercise 1206 
Let $f \in \operatorname{Bill}(\mathbb{R}^3, \mathbb{R}^3)$ be the symmetric bilinear
form defined by the matrix
\[
\begin{bmatrix} -3 & 1 & 0 \\ 1 & -6 & 1 \\ 0 & 1 & 7 \end{bmatrix}.
\]
Find the quadratic form defined by $f$.


Exercise 1207 
Let f € Bill(R?, R?) be the symmetric bilinear form defined by the matrix 


2 -1 5 
-1 l ; . Find the quadratic form defined by f. 
5 a =3 
3 


Exercise 1208 

Find a symmetric bilinear form $f \in \operatorname{Bill}(\mathbb{R}^3, \mathbb{R}^3)$
which defines the quadratic form

\[
\begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto 2ab + 4ac + 6bc.
\]


Exercise 1209 

Let $F$ be a field of characteristic other than 2, and let $V$ be a vector space
over $F$. Let $q : V \to F$ be a function satisfying the condition that
$q(v + w) + q(v - w) = 2q(v) + 2q(w)$ for all $v, w \in V$. Show that the function
$f : V \times V \to F$ defined by
$f : (v, w) \mapsto \tfrac{1}{4}[q(v + w) - q(v - w)]$ is a symmetric bilinear form.
Exercise 1210

Let $V$ be a vector space over a field $F$ of characteristic other than 2, and let
$f \in \operatorname{Bill}(V \times V)$ be a symmetric bilinear form which defines a
quadratic form $q : V \to F$. Show that

\[ q(u + v + w) = q(u + v) + q(u + w) + q(v + w) - q(u) - q(v) - q(w) \]

for all $u, v, w \in V$.


Exercise 1211 
Let $V$ be a vector space over a field $F$. Show that $V \cong F \otimes V$.


Exercise 1212 
Let $V$ and $W$ be vector spaces over a field $F$. Let $x \in V \otimes W$ be written
in the form $x = \sum_{i=1}^n v_i \otimes w_i$, where $n$ is minimal in the sense that
there is no way to express $x$ in the form $\sum_{i=1}^k v'_i \otimes w'_i$ for any
$k < n$. Show that $\{v_1, \dots, v_n\}$ is a linearly-independent subset of $V$ and
that $\{w_1, \dots, w_n\}$ is a linearly-independent subset of $W$.


Exercise 1213 
Let $K$ be a field containing $F$ as a subfield. If $V$ is a vector space over $F$,
show that $K \otimes V$ is a vector space over $K$.


Exercise 1214 

Let $V$ be a vector space of finite dimension $n$ over a field $F$ and let $Y$ be the
subspace of $V \otimes V$ generated by all elements of the form
$v \otimes v' - v' \otimes v$, where $v, v' \in V$. Find the dimension of $Y$.


Exercise 1215 

Let $V$ and $W$ be finite dimensional vector spaces over a field $F$. Let
$v, v' \in V$ and $w, w' \in W$ be vectors satisfying the condition
$v \otimes w = v' \otimes w'$, where this is not the identity element of $V \otimes W$
with respect to addition. Show that there exists a scalar $c \in F$ such that
$v = cv'$ and $w' = cw$.


Exercise 1216 

Let $F$ be a field and, for all $A, B \in M_{2 \times 2}(F)$, denote the Kronecker
product of $A$ and $B$ by $A \otimes B$. If $\{H_1, \dots, H_4\}$ is the canonical basis
for $M_{2 \times 2}(F)$, is $\{H_i \otimes H_j \mid 1 \le i, j \le 4\}$ a basis for
$M_{4 \times 4}(F)$?


Exercise 1217 
Find the numerical range of the quadratic form q : R? > R defined by q : v > 
vT oe v 
0 0j 
Exercise 1218 


Let n be a positive integer and let F be a field. If A E€ Mnxn(F) is a magic 
matrix, is the same true for A @ A € Manx2n (F)? 
Exercise 1219

Let $F$ be a field and let $k$ and $n$ be positive integers. If matrices
$A \in M_{k \times k}(F)$ and $B \in M_{n \times n}(F)$ have eigenvalues $a$ and $b$,
respectively, show that $ab$ is an eigenvalue of $A \otimes B$.


Exercise 1220

Let $F$ be a field and let $k$ and $n$ be positive integers. If matrices
$A \in M_{k \times k}(F)$ and $B \in M_{n \times n}(F)$ have eigenvalues $a$ and $b$,
respectively, find a matrix $C \in M_{kn \times kn}(F)$ with eigenvalue $a + b$.


Exercise 1221 

Let $F$ be a field of characteristic other than 2 and let $V$ be a vector space over
$F$. Find the minimal polynomial of the endomorphism $\alpha$ of $V \otimes V$ defined
by $\alpha : \sum_i a_i (v_i \otimes w_i) \mapsto \sum_i a_i (w_i \otimes v_i)$.


Exercise 1222 

Let $F$ be a field, let $k$, $n$, $s$, and $t$ be positive integers, and consider
matrices $A \in M_{k \times n}(F)$ and $B \in M_{s \times t}(F)$. Is the rank of
$A \otimes B$ necessarily equal to the product of the ranks of $A$ and $B$?


Exercise 1223 

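Let $V$, $V'$, $W$, $W'$ be vector spaces over a field $F$ and let
$\alpha : V \to V'$ and $\beta : W \to W'$ be monic linear transformations. Let
$\alpha \otimes \beta$ be the linear transformation from $V \otimes W$ to
$V' \otimes W'$ defined by
$\alpha \otimes \beta : \sum_{i=1}^n a_i (v_i \otimes w_i) \mapsto \sum_{i=1}^n a_i [\alpha(v_i) \otimes \beta(w_i)]$.
Is $\alpha \otimes \beta$ monic?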


Exercise 1224 

Let $F$ be a field and let $(K, \bullet)$ and $(L, *)$ be $F$-algebras. Define an
operation $\odot$ on $K \otimes L$ by setting
$(v \otimes w) \odot (v' \otimes w') = (v \bullet v') \otimes (w * w')$ for all
$v, v' \in K$ and $w, w' \in L$. Is $(K \otimes L, \odot)$ an $F$-algebra?


Exercise 1225 


Let V = R? and let W = V ® V. If w € W is normal, do there necessarily exist 
normal vectors v, v’ € V such that w = v @ v’? 


Exercise 1226 
Let $V$ be a vector space over $\mathbb{R}$. Show that the complexification of $V$ is
isomorphic to $\mathbb{C} \otimes V$.


Exercise 1227 

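Let $V$ be an inner product space over $\mathbb{R}$ having a basis
$\{v_i \mid i \in \Omega\}$ and let $W$ be an inner product space over $\mathbb{R}$
having a basis $\{w_j \mid j \in \Lambda\}$. Define a function
$\mu : (V \otimes W) \times (V \otimes W) \to \mathbb{R}$ by setting

\[
\mu : \Bigl( \sum_{i \in \Omega} \sum_{j \in \Lambda} a_{ij} (v_i \otimes w_j),
             \sum_{i \in \Omega} \sum_{j \in \Lambda} b_{ij} (v_i \otimes w_j) \Bigr)
  \mapsto \sum_{i \in \Omega} \sum_{j \in \Lambda} a_{ij} b_{ij}
          [\langle v_i, v_i \rangle + \langle w_j, w_j \rangle].
\]

Is $\mu$ an inner product on $V \otimes W$?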


Exercise 1228 

Let $V$ be a vector space over a field $F$ and let $\alpha \in \operatorname{End}(V)$.
Is the function $V \wedge V \to V \wedge V$ defined by
$\sum_{i=1}^n c_i (v_i \wedge w_i) \mapsto \sum_{i=1}^n c_i (\alpha(v_i) \wedge \alpha(w_i))$
a linear transformation?


Exercise 1229 
Let $n$ be a positive integer and let $A, B \in M_{n \times n}(\mathbb{R})$ be
orthogonal matrices. Is their Kronecker product $A \otimes B$ an orthogonal matrix?


Exercise 1230
Let $n$ be a positive integer and let $A, B \in M_{n \times n}(\mathbb{R})$ be
permutation matrices. Is their Kronecker product $A \otimes B$ a permutation matrix?


Exercise 1231 
Let $k$ and $n$ be positive integers and let $F$ be a field. Let
$A \in M_{k \times k}(F)$ and let $B \in M_{n \times n}(F)$. Is it necessarily true
that $\operatorname{tr}(A \otimes B) = \operatorname{tr}(A) \operatorname{tr}(B)$?


Exercise 1232
Let $k$ and $n$ be positive integers and let $F$ be a field. For
$A \in M_{k \times k}(F)$ and $B \in M_{n \times n}(F)$, find a matrix
$C \in M_{kn \times kn}(F)$ such that $e^C = e^A \otimes e^B$.


Exercise 1233 


The matrix plays an important part in quantum information the- 


O O Onm 
oroe 
ooroeo 
= OCC 


ory. Write this matrix as a sum of Kronecker products of | o | and the three 


Pauli matrices. 
K

Karatsuba’s algorithm, 45 
Kernel, 94 

Ket-bra product, 136 
Kovarik algorithm, 445 
Kronecker product, 467 
Krylov algorithm, 301 
Krylov subspace, 297 


L 
Lagrange identity, 339 


Lagrange interpolation polynomial, 162 


Lanczos algorithm, 301 
Laurent series, formal, 18 
Leading coefficient, 44 
Leading entry, 194 
Least squares method, 447 
Legendre polynomial, 370 
Lie algebra, 41 
general, 147 
special, 320 
Lie product, 42 
Lightcone inequality, 463 
Linear combination, 27 
Linear functional, 317 
Linear transformation, 89 
adjoint of, 383 
Linear variety, 96 
Linearly dependent, 57 
locally, 92 
Linearly independent, 57, 110 
nearly, 82 


with respect to a fuzzification, 87 


Linearly recurrent sequence, 298 
List, 1 

Locally linearly dependent, 92 
Loewner partial order, 404 
Lorentz form, 462 
Lower-triangular matrix, 150 
LU-decomposition, 168 


M 

Magic matrix, 287 

Markov matrix, 150 

Matrices 
column-equivalent, 161 
congruent, 457 
equivalent, 161 


Pauli, 82 

row-equivalent, 161 

similar, 266 

unitarily similar, 420 
Matrix, 23 


adjacency, 399 


adjoint, 235 
anti-Hermitian, 413 
band, 149 
change-of-basis, 162 
circulant, 172 
coefficient, 191 
companion, 265 
complex Hadamard, 388 
determinant of, 227 
diagonal, 148 

distance, 400 
elementary, 153 
extended coefficient, 191 
Givens rotation, 420 
Google, 282 

Gram, 336 

Hadamard, 229 

Hankel, 241 

Hermitian, 396 

Hilbert, 159 
Householder, 423 

in block form, 138 

in reduced row echelon form, 194 
in row echelon form, 193 
involutory, 151 

Jacobi reflection, 422 
lower-triangular, 150 
magic, 287 

Markov, 150 
Nievergelt’s, 159 
nonsingular, 151 
normal, 434 

orthogonal, 422 
permutation, 158 
quasidefinite, 415 

scalar, 148 

singular, 151 
skew-symmetric, 150 
sparse, 167 

special orthogonal, 424 
stochastic, 150 

strictly diagonally dominant, 249 
symmetric, 150 
symmetric Toeplitz, 202 
symplectic, 440 
transpose, 95 
tridiagonal, 149 
unitary, 419 

upper Hessenberg, 473 
upper-triangular, 149 
Vandermonde, 163 
zero, 24 


Matrix logarithm, 356 
Matrix representation, 153 
Maximal, 61
Maximal subspace, 325 
Method of condensation, 230 
Minimal, 61 
Minimal polynomial, 273, 298 
Minkowski’s inequality, 339 
Minor, 230 
Modular Law, 31 
Moebius function, 48 
Monic 

function, 2 

polynomial, 44 
Monomorphism, 95 
Moore-Penrose pseudoinverse, 441 
Multiplication 

in a field, 5 

scalar, 21 
Multiplication table, 65 
Multiplicity 

algebraic, 270 

geometric, 270 
Mutually orthogonal, 369 


N 
Nearly linearly independent, 82 
Nevanlinna—Pick Interpolation Theorem, 403 
Nievergelt’s matrix, 159 
Nilpotent, 302 
Nondegenerate bilinear form, 455 
Nonderogatory endomorphism, 270 
Nonhomogeneous system of linear equations, 
190 
Nonsingular matrix, 151 
Nontrivial subspace, 25 
Norm, 338, 342 
cut, 345 
elliptic, 403 
Euclidean, 338 
Frobenius, 345 
Hamming, 352 
Hilbert-Schmidt, 345 
induced, 343 
spectral, 344 
triangular, 37 
Normal endomorphism, 424 
Normal matrix, 434 
Normal vector, 338 
Normed space, 342 
Norms 
equivalent, 346 
Nullity, 98 
Number 
complex, 7 
rational, 5 
real, 5
Numerical range, 463 


O
Odd function, 76 
Odd permutation, 224 
Operation 
elementary, 158 
Opposite transformation, 453 
Optimization algebra, 9 
Order of recurrence, 298 
Orderable field, 18 
Orthogonal, 369 
Orthogonal complement, 374 
right, 458 
Orthogonal matrix, 422 
Orthogonal projection, 374 
Orthogonality with respect to a bilinear form, 
458 
Orthogonally diagonalizable, 398 
Orthonormal, 375 


P 
Padé approximant, 239 
Pairwise disjoint, 27 
Parallelogram law, 339 
Parity, 456 
Parseval’s identity, 380 
Partial order, 60 
Loewner, 404 
Partial pivoting, 168 
Partially-ordered set, 60 
Pauli matrices, 82 
Periodic function, 69 
Permanent, 252 
Permutation, 2 
even, 224 
odd, 224 
Permutation matrix, 158 
Pfaffian, 228 
Piecewise constant, 34 
Pivot, 168 
Pivoting 
full, 168 
partial, 168 
Poincaré—Birkhoff—Witt Theorem, 42 
Polar decomposition, 431 
Polarization identity 
complex, 363 
real, 363 
Polynomial, 44 
characteristic, 264, 298 
Chebyshev, 370 
completely reducible, 49 



cyclotomic, 48 

flat, 51 

in several indeterminates, 50 

irreducible, 47 

Jacobi, 371 

Lagrange interpolation, 162 

Legendre, 370 

minimal, 273, 298 

monic, 44 

reciprocal, 440 

reducible, 47 

trigonometric, 56 

zero, 44 
Polynomial function, 47 
Positive definite, 403 
Positive quadratic form, 462 
Positive semidefinite, 403 
Power 

exterior, 470 
Pre-Banach space, 342 
Pre-Hilbert space, 333 
Primitive root of unity, 152 
Principal component analysis, 118 
Process 

Arnoldi, 375 

Gram-Schmidt, 372 
Product 

bra-ket, 136 

Cartesian, 2 

cross, 42 

direct, 23 

dot, 334 

dyadic, 145 

exterior, 136 

Hadamard, 174 

inner, 333 

interior, 136 

Jordan, 43 

ket-bra, 136 

Kronecker, 467 

Lie, 42 

scalar triple, 339 

Schur, 174 

tensor, 465 

vector triple, 339 
Projection, 118 

onto an affine set, 387 

orthogonal, 374 
Proper subspace, 25 
Pseudoinverse 

Drazin, 450 

Moore-Penrose, 441 


Q 
QR algorithm, 380 
QR-decomposition, 379 
Quadratic form, 461 
positive, 462 
Quadratic surface, 464 
Quasidefinite matrix, 415 
Quaternion 
algebra, 66 
real, 66 
QZ algorithm, 380 


R 

Range, 2 
numerical, 463 

Rank, 98, 199, 457 


Rational Decomposition Theorem, 304 


Rational number, 5 
Rayleigh quotient 

function, 400 

iteration scheme, 400 
Real Euclidean, 333 
Real number, 5 
Real part, 7 
Real polarization identity, 363 
Real quaternion, 66 
Reciprocal polynomial, 440 


Reduced row echelon form, 194 


Reducible, 47 
Relation 
equivalence, 120 
partial order, 60 
Relaxation method, 207 
Representation, 153 
Restriction, 2 


Riesz Representation Theorem, 382 
Right orthogonal complement, 458 


Row echelon form, 193 
Row equivalent, 161 
Row space, 199 


S 

Scalar, 22 

Scalar matrix, 148 

Scalar multiplication, 21 
Scalar triple product, 339 
Schur complement, 160 
Schur product, 174 
Schur’s Theorem, 421 
Selfadjoint, 395 
Semifield, 9 

Semisimple eigenvalue, 270 
Sequence, 1
Fibonacci, 299 
linearly recurrent, 298 
Set 
difference, 2 
generating, 28 
partially-ordered, 60 
spanning, 28 
Sherman—Morrison—Woodbury Theorem, 154 
Signum, 224 
Similar matrices, 266 
Simple eigenvalue, 270 
Simple tensors, 465 
Simpson’s rule, 187 
Singular matrix, 151 
Singular value, 432 
Singular Value Decomposition Theorem, 431 
Skew symmetric bilinear form, 453 
Skew symmetric matrix, 150 
Solution set, 191 
Solution space, 191 
SOR, 207 
Space 
dual, 317 
inner product, 333 
normed, 342 
pre-Banach, 342 
pre-Hilbert, 333 
solution, 191 
Spanning set, 28 
Sparse matrix, 167 
Special Lie algebra, 320 
Special orthogonal matrix, 424 
Spectral condition number, 432 
Spectral Decomposition Theorem, 429 
Spectral norm, 344 
Spectral radius, 258 
Spectrum, 255 
Spline function, 56 
Stabilizer, 131 
Standard identity, 238 
Stationary iteration method, 208 
Steinitz Replacement Property, 60 
Stochastic matrix, 150 
Strassen—Winograd algorithm, 166 
Strictly diagonally dominant, 249 
Subalgebra, 41 
unital, 41 
Subfield, 6 
Euclidean, 333 
real Euclidean, 333 
Subset 
affine, 96 
bounded, 67 


chain, 61 
convex, 412 
Hilbert, 376 
orthonormal, 375 
underlying, 1 


Subspace, 25 


cyclic, 117 
fuzzy, 37 
generated by, 28 
improper, 25 
invariant, 117 
Krylov, 297 
maximal, 325 
nontrivial, 25 
proper, 25 
spanned by, 28 
trivial, 25 


Subspaces 


disjoint, 27 
independent, 74 


pairwise disjoint, 27 
Successive overrelaxation method, 207 
Sylvester’s Theorem, 98 


Symmetric 


bilinear transformation, 453 


matrix, 150 


Toeplitz matrix, 202 
with respect to an involution, 385 
Symmetric difference, 24 


Symplectic matrix, 440 


System of linear equations, 189
homogeneous, 190
nonhomogeneous, 190


T


Taber’s Theorem, 321 
Taylor coefficient, 117 
Tensor algebra, 470 
Tensor product, 465 
Tensors 


entangled, 465 
simple, 465 


Ternary ring, 18 
Trace, 318 
Transcendental, 73 
Transform 


discrete cosine, 152 


discrete Fourier, 152, 341 


fast Fourier, 152 


Transformation 


affine, 96 
bilinear, 453 
linear, 89 
opposite, 453 
Transpose, 95

conjugate, 334 

Hermitian, 334 
Triangle difference inequality, 339 
Triangle inequality, 353 
Triangular norm, 37 
Tridiagonal matrix, 149 
Trigonometric polynomial, 56 
Trivial subspace, 25 


U
Underlying subset, 1
Uniform combination, 38
Uniform hull, 38
Union, 1
Unit, 40
Unital algebra, 39
Unital subalgebra, 41
Unitarily similar matrices, 420
Unitary automorphism, 419
Unitary matrix, 419
Upper Hessenberg matrix, 473
Upper-triangular matrix, 149
V
Vandermonde matrix, 163 
Variety 
linear, 96 
Vector, 22 
normal, 338 
Vector addition, 21
Vector space, 21 
finite dimensional, 71 
finitely generated, 29 
infinite dimensional, 71 
Vector triple product, 339 
Vectors 
orthogonal, 369 
orthogonal with respect to a bilinear form, 
458 


W
Wavelet 

Haar, 376 
Weak dual space, 323 
Weight function, 318 
Weight of a Baxter algebra, 90 
Weighted dot product, 335 
Well Ordering Principle, 61 
Weyl’s Problem, 399 
Width of a band matrix, 149 
Wiedemann algorithm, 301 
Word, 23 
Wronskian, 228 


Z 

Zero-sum combination, 38 
Zero-sum hull, 38 
Zlobec’s formula, 445 
Zorn’s Lemma, 67 


